Commit Graph

81366 Commits

Author SHA1 Message Date
Eric Blake
ebd57062a1 nbd: Simplify meta-context parsing
We had a premature optimization of trying to read as little from the
wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
But in reality, we HAVE to read the entire string from the client
before we can get to the next command, and it is easier to just read
it all at once than it is to read it in pieces.  And once we do that,
several functions end up no longer performing I/O, so they can drop
length and errp parameters, and just return a bool instead of
modifying through a pointer.

Our iotests still pass; I also checked that libnbd's testsuite (which
covers more corner cases of odd meta context requests) still passes.
There are cases where the sequence of trace messages produced differs
(for example, when no bitmap is exported, a query for "qemu:" now
produces two trace lines instead of one), but trace points are for
debug and have no effect on what the client sees.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:05:08 -05:00
Eric Blake
d1e2c3e7bd nbd/server: Reject embedded NUL in NBD strings
The NBD spec is clear that any string sent from the client must not
contain embedded NUL characters.  If the client passes "a\0", we
should reject that option request rather than act on "a".

Testing this is not possible with a compliant client, but I was able
to use gdb to coerce libnbd into temporarily behaving as such a
client.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:05:08 -05:00
Eric Blake
029a88c9a7 qemu-nbd: Honor SIGINT and SIGHUP
Honoring just SIGTERM on Linux is too weak; we also want to handle
other common signals, and do so even on BSD.  Why?  Because at least
'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on
bitmaps when the server is shut down via a signal.

See also: http://bugzilla.redhat.com/1883608

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: apply comment tweak suggested by Vladimir; fix ifdef around
termsig_handler]
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:05:04 -05:00
Vladimir Sementsov-Ogievskiy
99d72dba1c block/nbd: nbd_co_reconnect_loop(): don't connect if drained
In a recent commit 12c75e20a2 we've improved
nbd_co_reconnect_loop() to not make drain wait for additional sleep.
Similarly, we shouldn't try to connect, if previous sleep was
interrupted by drain begin, otherwise drain_begin will have to wait for
the whole connection attempt.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:04:32 -05:00
Vladimir Sementsov-Ogievskiy
46f56631b5 block/nbd: fix reconnect-delay
reconnect-delay has a design flaw: we handle it in the same loop where
we do connection attempt. So, reconnect-delay may be exceeded by
unpredictable time of connection attempt.

Let's instead use separate timer.

How to reproduce the bug:

1. Create an image on node1:
   qemu-img create -f qcow2 xx 100M

2. Start NBD server on node1:
   qemu-nbd xx

3. On node2 start qemu-io:

./build/qemu-io --image-opts \
driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=15

4. Type 'read 0 512' in qemu-io interface to check that connection
   works

Be careful: you should make steps 5-7 in a short time, less than 15
seconds.

5. Kill nbd server on node1

6. Run 'read 0 512' in qemu-io interface again, to be sure that nbd
client goes to reconnect loop.

7. On node1 run the following command

   sudo iptables -A INPUT -p tcp --dport 10809 -j DROP

This will make the connect() call of qemu-io at node2 take a long time.

And you'll see that read command in qemu-io will hang for a long time,
more than 15 seconds specified by reconnect-delay parameter. It's the
bug.

8. Don't forget to drop iptables rule on node1:

   sudo iptables -D INPUT -p tcp --dport 10809 -j DROP

Important note: Step [5] is necessary to reproduce _this_ bug. If we
miss step [5], the read command (step 6) will hang for a long time and
this commit doesn't help, because there will be not long connect() to
unreachable host, but long sendmsg() to unreachable host, which should
be fixed by enabling and adjusting keep-alive on the socket, which is a
thing for further patch set.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:04:32 -05:00
Vladimir Sementsov-Ogievskiy
8a509afd72 block/nbd: correctly use qio_channel_detach_aio_context when needed
Don't use nbd_client_detach_aio_context() driver handler where we want
to finalize the connection. We should directly use
qio_channel_detach_aio_context() in such cases. Driver handler may (and
will) contain another things, unrelated to the qio channel.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:04:32 -05:00
Vladimir Sementsov-Ogievskiy
8c517de24a block/nbd: fix drain dead-lock because of nbd reconnect-delay
We pause reconnect process during drained section. So, if we have some
requests, waiting for reconnect we should cancel them, otherwise they
deadlock the drained section.

How to reproduce:

1. Create an image:
   qemu-img create -f qcow2 xx 100M

2. Start NBD server:
   qemu-nbd xx

3. Start vm with second nbd disk on node2, like this:

  ./build/x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \
     file=/work/images/cent7.qcow2 -drive \
     driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=60 \
     -vnc :0 -m 2G -enable-kvm -vga std

4. Access the vm through vnc (or some other way?), and check that NBD
   drive works:

   dd if=/dev/sdb of=/dev/null bs=1M count=10

   - the command should succeed.

5. Now, kill the nbd server, and run dd in the guest again:

   dd if=/dev/sdb of=/dev/null bs=1M count=10

Now Qemu is trying to reconnect, and dd-generated requests are waiting
for the connection (they will wait up to 60 seconds (see
reconnect-delay option above) and than fail). But suddenly, vm may
totally hang in the deadlock. You may need to increase reconnect-delay
period to catch the dead-lock.

VM doesn't respond because drain dead-lock happens in cpu thread with
global mutex taken. That's not good thing by itself and is not fixed
by this commit (true way is using iothreads). Still this commit fixes
drain dead-lock itself.

Note: probably, we can instead continue to reconnect during drained
section. To achieve this, we may move negotiation to the connect thread
to make it independent of bs aio context. But expanding drained section
doesn't seem good anyway. So, let's now fix the bug the simplest way.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:04:32 -05:00
Christian Borntraeger
bbc35fc20e nbd: silence maybe-uninitialized warnings
gcc 10 from Fedora 32 gives me:

Compiling C object libblock.fa.p/nbd_server.c.o
../nbd/server.c: In function ‘nbd_co_client_start’:
../nbd/server.c:625:14: error: ‘namelen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  625 |         rc = nbd_negotiate_send_info(client, NBD_INFO_NAME, namelen, name,
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  626 |                                      errp);
      |                                      ~~~~~
../nbd/server.c:564:14: note: ‘namelen’ was declared here
  564 |     uint32_t namelen;
      |              ^~~~~~~
cc1: all warnings being treated as errors

As I cannot see how this can happen, let uns silence the warning.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20200930155859.303148-3-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:04:32 -05:00
Alex Bennée
e5d402b28f tests/acceptance: disable machine_rx_gdbsim on GitLab
While I can get the ssh test to fail on my test setup this seems a lot
more stable except when on GitLab. Hopefully we can re-enable both
once the serial timing patches have been added.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Willian Rampazzo <willianr@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Message-Id: <20201007160038.26953-23-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Paolo Bonzini
2a5a79d1b5 cirrus: use V=1 when running tests on FreeBSD and macOS
Using "V=1" makes it easier to identify hanging tests, especially
since they are run with -j1.  It is already used on Windows builds,
do the same for FreeBSD and macOS.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Ed Maste <emaste@FreeBSD.org>
Message-Id: <20201007140103.711142-1-pbonzini@redhat.com>
Message-Id: <20201007160038.26953-22-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Yonggang Luo
27d891bca9 plugin: Fixes compiling errors on msys2/mingw
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201001163429.1348-3-luoyonggang@gmail.com>
Message-Id: <20201007160038.26953-21-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Yonggang Luo
b31371004f plugins: Fixes a issue when dlsym failed, the handle not closed
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201001163429.1348-2-luoyonggang@gmail.com>
Message-Id: <20201007160038.26953-20-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
98d3a72469 .mailmap: Fix more contributor entries
These authors have some incorrect author email field.
For each of them, there is one commit with the replaced
entry.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Erik Smit <erik.lucas.smit@gmail.com>
Message-Id: <20201006160653.2391972-13-f4bug@amsat.org>
Message-Id: <20201007160038.26953-19-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
311a73a964 contrib/gitdm: Add Yandex to the domain map
There is a number of contributors from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20201006160653.2391972-12-f4bug@amsat.org>
Message-Id: <20201007160038.26953-18-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
0f53854572 contrib/gitdm: Add Yadro to the domain map
There is a number of contributions from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20201006160653.2391972-11-f4bug@amsat.org>
Message-Id: <20201007160038.26953-17-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
99b19335f4 contrib/gitdm: Add SUSE to the domain map
There is a number of contributors from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Bruce Rogers <brogers@suse.com>
Message-Id: <20201006160653.2391972-10-f4bug@amsat.org>
Message-Id: <20201007160038.26953-16-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
0d056af514 contrib/gitdm: Add Nir Soffer to Red Hat domain
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Nir Soffer <nirsof@gmail.com>
Message-Id: <20201006160653.2391972-9-f4bug@amsat.org>
Message-Id: <20201007160038.26953-15-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
3b3453f2dc contrib/gitdm: Add Qualcomm to the domain map
There is a number of contributions from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20201006160653.2391972-8-f4bug@amsat.org>
Message-Id: <20201007160038.26953-14-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
0705260b55 contrib/gitdm: Add Nuvia to the domain map
There is a number of contributions from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Graeme Gregory <graeme@nuviainc.com>
Message-Id: <20201006160653.2391972-7-f4bug@amsat.org>
Message-Id: <20201007160038.26953-13-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
4766a2b227 contrib/gitdm: Add Google to the domain map
There is a number of contributors from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Erik Kline <ek@google.com>
Message-Id: <20201006160653.2391972-6-f4bug@amsat.org>
Message-Id: <20201007160038.26953-12-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
2f8cdb7672 contrib/gitdm: Add ByteDance to the domain map
There is a number of contributors from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Message-Id: <20201006160653.2391972-5-f4bug@amsat.org>
Message-Id: <20201007160038.26953-11-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
2ba17f9760 contrib/gitdm: Add Baidu to the domain map
There is a number of contributors from this domain,
add its own entry to the gitdm domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Chai Wen <chaiwen@baidu.com>
Message-Id: <20201006160653.2391972-4-f4bug@amsat.org>
Message-Id: <20201007160038.26953-10-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
da568cc906 contrib/gitdm: Add more individual contributors
These individual contributors have a number of contributions,
add them to the 'individual' group map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Helge Deller <deller@gmx.de>
Acked-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Acked-by: David Carlier <devnexen@gmail.com>
Acked-by: Paul Zimmerman <pauldzim@gmail.com>
Acked-by: Volker Rümelin <vr_qemu@t-online.de>
Acked-by: Finn Thain <fthain@telegraphics.com.au>
Message-Id: <20201006160653.2391972-3-f4bug@amsat.org>
Message-Id: <20201007160038.26953-9-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
33955b5672 contrib/gitdm: Add more academic domains
There is a number of contributions from these academic domains.
Add the entries to the gitdm 'academic' domain map.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Dayeol Lee <dayeol@berkeley.edu>
Acked-by: Fan Yang <Fan_Yang@sjtu.edu.cn>
Acked-by: Xinyu Li <precinct@mail.ustc.edu.cn>
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20201006160653.2391972-2-f4bug@amsat.org>
Message-Id: <20201007160038.26953-8-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Thomas Huth
7e86e5d5cc tests/docker: Add genisoimage to the docker file
genisoimage is needed for running the tests/qtest/cdrom-test qtest.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201006174347.152040-1-thuth@redhat.com>
Message-Id: <20201007160038.26953-7-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Yonggang Luo
5eb691df5a cirrus: msys2/mingw speed is up, add excluded target back
The following target are add back:
i386-softmmu,arm-softmmu,ppc-softmmu,mips-softmmu

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20201007145300.1197-3-luoyonggang@gmail.com>
Message-Id: <20201007160038.26953-6-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Yonggang Luo
0026b33992 cirrus: Fixing and speedup the msys2/mingw CI
Use cache of cirrus caching msys2
The install of msys2 are refer to https://github.com/msys2/setup-msys2
The first time install msys2 would be time consuming, so increase timeout_in to 90m
according to https://cirrus-ci.org/faq/#instance-timed-out

Apply patch of https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg00072.html

[AJB: renamed printenv_script to setup_script]

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20201007145300.1197-2-luoyonggang@gmail.com>
Message-Id: <20201007160038.26953-5-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Alex Bennée
de00b8b376 hw/ide: restore replay support of IDE
A recent change to weak reset handling broke replay due to the use of
aio_bh_schedule_oneshot instead of the replay aware
replay_bh_schedule_oneshot_event.

Fixes: 55adb3c456 ("ide: cancel pending callbacks on SRST")
Suggested-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <20201007160038.26953-4-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Philippe Mathieu-Daudé
42a052333a hw/misc/mips_cpc: Start vCPU when powered on
In commit 102ca9667d we set "start-powered-off" on all vCPUs
included in the CPS (Coherent Processing System) but forgot to
start the vCPUS on when they are powered on in the CPC (Cluster
Power Controller).

This fixes the following tests:

  $ avocado run tests/acceptance/machine_mips_malta.py
   (1/3) test_mips_malta_i6400_framebuffer_logo_1core: PASS (3.67 s)
   (2/3) test_mips_malta_i6400_framebuffer_logo_7cores: INTERRUPTED: Test interrupted by SIGTERM (30.22 s)
   (3/3) test_mips_malta_i6400_framebuffer_logo_8cores: INTERRUPTED: Test interrupted by SIGTERM (30.25 s)
  RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 2 | CANCEL 0

Fixes: 102ca9667d ("mips/cps: Use start-powered-off CPUState property")
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20201007113942.2523866-1-f4bug@amsat.org>
Message-Id: <20201007160038.26953-3-alex.bennee@linaro.org>
2020-10-09 17:27:55 +01:00
Paolo Bonzini
0c3dd50eae configure: fix performance regression due to PIC objects
Because most files in QEMU are grouped into static libraries, Meson conservatively
compiles them with -fPIC.  This is overkill and produces slowdowns up to 20% on
some TCG tests.

As a stopgap measure, use the b_staticpic option to limit the slowdown to
--enable-pie.  https://github.com/mesonbuild/meson/pull/7760 will allow
us to use b_staticpic=false and let Meson do the right thing.

Reported-by: Ahmed Karaman <ahmedkrmn@outlook.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200924092314.1722645-57-pbonzini@redhat.com>
Message-Id: <20201007160038.26953-2-alex.bennee@linaro.org>
2020-10-09 17:27:50 +01:00
Peter Maydell
4a7c0bd9dc ppc patch queue 2020-10-09
Here's the next set of ppc related patches for qemu-5.2.  There are
 two main things here:
 
 * Cleanups to error handling in spapr from Greg Kurz
 * Improvements to NUMA handling for spapr from Daniel Barboza
 
 There are also a handful of other bugfixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl9//PUACgkQbDjKyiDZ
 s5KdQQ/9EKl8GRuNw1CaoMRZFnD5YCDnr6Piy24HpcINHm8khvC4SWEaMm2ESOLU
 J5e9rQn2vXlHLWDA0qQ8pTTEMqfgAOuYllGQXTnTKF3tjePEZzsYzdg49v8O3dsb
 EHOAvixsHocH+8KMsiQkbV5BZYEEJukX6RoGGm6vte+MTXdRlpyxmp9Xf52tGmEB
 pU/Q2Y9oLR6OW7POWv3kfpmCfxklkOXstguEMTP42+ZGP17PBvpKXAXfW13gCl8t
 yGvvcjWr64m9uTyqTxYWK/jFxxYa8hraKPk4BY/001UCypd+T/DrD7E/xlBMZwPh
 eDRX7fV+YPcRqv66x47Gu40afEVm3mlQXzr0QaK5qm772f+v6C/xyLUznLNxYdLy
 s9lKSi7wSxjBS8M8jztRoCJEx+zVe6BclJbwdzGQMYODiY13HKVENFUzPxrC9bfN
 IxYAU3uAN3VL/agslEYV+aBrX0qj96c1Ek6CcFG2XXdR3k9QnYvUcQuPKcfuCBSX
 TVS2mYger8Ba4E47tapH++TKj5jHoVKgTciSN663+gUCGzNTw+5UEZBxEHTQaPOX
 a5SKh5t06PEkxpBK4ITnQfeRwvkMg4ERjJoKPXWzcqvHUWK+xaI8XbBlqCDMiC3T
 mBAVHMIrKEe6J9tTqeURyct3ItUioneueLWNSplBUN3BPkE+7AQ=
 =dbvK
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.2-20201009' into staging

ppc patch queue 2020-10-09

Here's the next set of ppc related patches for qemu-5.2.  There are
two main things here:

* Cleanups to error handling in spapr from Greg Kurz
* Improvements to NUMA handling for spapr from Daniel Barboza

There are also a handful of other bugfixes.

# gpg: Signature made Fri 09 Oct 2020 07:02:29 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-5.2-20201009:
  specs/ppc-spapr-numa: update with new NUMA support
  spapr_numa: consider user input when defining associativity
  spapr_numa: change reference-points and maxdomain settings
  spapr_numa: forbid asymmetrical NUMA setups
  spapr: add spapr_machine_using_legacy_numa() helper
  ppc/pnv: Increase max firmware size
  spapr: Add a return value to spapr_check_pagesize()
  spapr: Add a return value to spapr_nvdimm_validate()
  spapr: Simplify error handling in spapr_cpu_core_realize()
  spapr: Add a return value to spapr_set_vcpu_id()
  spapr: Simplify error handling in prop_get_fdt()
  spapr: Add a return value to spapr_drc_attach()
  spapr: Simplify error handling in spapr_vio_busdev_realize()
  spapr: Simplify error handling in do_client_architecture_support()
  spapr: Get rid of cas_check_pvr() error reporting
  spapr: Simplify error handling in callers of ppc_set_compat()
  ppc: Fix return value in cpu_post_load() error path
  ppc: Add a return value to ppc_set_compat() and ppc_set_compat_all()
  spapr: Fix error leak in spapr_realize_vcpu()
  spapr: Handle HPT allocation failure in nested guest

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-10-09 15:48:04 +01:00
Peter Maydell
e1c30c43cd Error reporting patches for 2020-10-09
-----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl+ABv0SHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZT+5EP9ibVF/L1Nu7jnqMt1kRn1BU308XUoGL7
 2hAZ7ys+agQd6DUttVBSwi7apZueXznV32E27FggncvCWqh2PeHn+6oCdBpSIKEV
 8AX8igui8DNsh0xd2l1JjVj0qNWYU1ZK9vTsMXdP3Ha+eqzDpoINOQKWtLYGKAxX
 mefk6wldGo2bvSw0kkXQ6fgYHz0stztKF1YCpTmktqZIiLEvLZ+clKCsTWXOpGZc
 sJhZnMBGejETNrPMevwhcy+BWAOp6k14fFlM4adfJHbwxkLYv2jr36MPAqAo894w
 KJK7tYWhGdrFSvx6e6jMdoRynJ8R5uHbw2Z4Xadx8VT3h+I+hm4AXjAiIdWIxlGn
 lNxzGJPvhXJC0uEOOeQthL+//IGbbkvo7dvEMLirZf6IT/Lbcyp5p1eyv409ShwQ
 KL3OOYRj3YMxDKe/Vxdt3B9Q/B2NXcQmuCF29eZMa1RwCQuEqHIXfpIy7HAQNPxH
 bAlGa+THnHrdWeir6F1tNABFjwDMHLep+HlnikCEL1DB25iDVKloary77VL2GSt3
 A4BQsikYwyUa7Cvok5r/rUgQ4akSegt28s6VQlIcocSf9tf91wTo2OaVWEXbDXdQ
 M7V27usT38x0qiZkUvajdfZ1erfXZ7p3/xnJmdg3BtaiB83VOEjz7VT2P5+beF7Y
 HtDs58b+wcE=
 =e5I5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2020-10-09' into staging

Error reporting patches for 2020-10-09

# gpg: Signature made Fri 09 Oct 2020 07:45:17 BST
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2020-10-09:
  error: Use error_fatal to simplify obvious fatal errors (again)
  error: Remove NULL checks on error_propagate() calls (again)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-10-09 14:47:45 +01:00
Peter Maydell
b7092cda1b Monitor patches for 2020-10-09
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl9/8kMSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTSa0P/3Ko/KPXbI6EkCUCicLBLpChbgAfsPsa
 y+cb/m2kiBomnoIhyDVnhGBhNXUan/A8V3lCkGL/f3/CQ25roODPWW9jt6oQ2fkR
 xVqgxPuTbfNzjNS6wR9PpEp2urFgULJ9lv7MxKsxj6Arv3PoxQTirbXzp1uFGN2D
 q01r3QSB/8xkYgZx3/XwAcTGKeqeFeRJs8+RPU4XqNH274RDi3tTQo9BJKucI24D
 7BDJSSnSYFU4mDtVUJKiaoWaWax5y7T0mNDZQ0MqYK59ICToPwRgfeyfjSzFAC9J
 LucPFDyOBXHyzywL4WqH8ZUe5eKpWUflJdt7twoeNtZiMhezvH9f6U1PR8rZOtWN
 mrhygkugiChSQOLOYqPboEaM5QCUptEQDBSJ9fjs7j+YEm2wwwIHSBxev41aAAZ7
 8i20hk4ymBrBCyASHmnOfYogNFqthOi5KrksO6L8tTIubLYXl4a1LKRCuncrqm1/
 3rREaSN4ghXcowuxuxvSAEmLoGnxr3ylTyS4yT/6bqCURBANKntEuXd9PN4GPtpM
 obVuLNQnQDWIKdp8tQe6P4QeeO5A9kJPCxTWgPPwTNrZnKkVM8X4LU68/qvZQ2A0
 Cd08fAtllF+XSqEb/7Y2BaCwtEjz3oxyPiCYwpg62MNJ0Sasvoqd9GPfbpQygHDF
 WnLT461HbufA
 =LexB
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2020-10-09' into staging

Monitor patches for 2020-10-09

# gpg: Signature made Fri 09 Oct 2020 06:16:51 BST
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-monitor-2020-10-09:
  block: Convert 'block_resize' to coroutine
  block: Add bdrv_lock()/unlock()
  block: Add bdrv_co_enter()/leave()
  util/async: Add aio_co_reschedule_self()
  hmp: Add support for coroutine command handlers
  qmp: Move dispatcher to a coroutine
  qapi: Add a 'coroutine' flag for commands
  monitor: Make current monitor a per-coroutine property
  qmp: Call monitor_set_cur() only in qmp_dispatch()
  qmp: Assert that no other monitor is active
  hmp: Update current monitor only in handle_hmp_command()
  monitor: Use getter/setter functions for cur_mon
  monitor: Add Monitor parameter to monitor_get_cpu_index()
  monitor: Add Monitor parameter to monitor_set_cpu()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-10-09 13:20:46 +01:00
Markus Armbruster
805d44961b error: Use error_fatal to simplify obvious fatal errors (again)
Patch created mechanically by rerunning:

    $ spatch --in-place --sp-file scripts/coccinelle/use-error_fatal.cocci \
	     --macro-file scripts/cocci-macro-file.h --use-gitgrep .

Variables now unused dropped manually.

Cc: Eric Auger <eric.auger@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200722084048.1726105-5-armbru@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2020-10-09 08:36:23 +02:00
Markus Armbruster
2155ceaf25 error: Remove NULL checks on error_propagate() calls (again)
Patch created mechanically by rerunning:

    $ spatch --sp-file scripts/coccinelle/error_propagate_null.cocci \
             --macro-file scripts/cocci-macro-file.h \
             --use-gitgrep .

Cc: Jens Freimann <jfreimann@redhat.com>
Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200722084048.1726105-4-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-10-09 08:36:23 +02:00
Kevin Wolf
eb94b81a94 block: Convert 'block_resize' to coroutine
block_resize performs some I/O that could potentially take quite some
time, so use it as an example for the new 'coroutine': true annotation
in the QAPI schema.

bdrv_truncate() requires that we're already in the right AioContext for
the BlockDriverState if called in coroutine context. So instead of just
taking the AioContext lock, move the QMP handler coroutine to the
context.

Call blk_unref() only after switching back because blk_unref() may only
be called in the main thread.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-15-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:20 +02:00
Kevin Wolf
18c6ac1c6e block: Add bdrv_lock()/unlock()
Inside of coroutine context, we can't directly use aio_context_acquire()
for the AioContext of a block node because we already own the lock of
the current AioContext and we need to avoid double locking to prevent
deadlocks.

This provides helper functions to lock the AioContext of a node only if
it's not the same as the current AioContext.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-14-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:20 +02:00
Kevin Wolf
e336fd4c4b block: Add bdrv_co_enter()/leave()
Add a pair of functions to temporarily move the current coroutine to the
AioContext of a given BlockDriverState.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-13-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:20 +02:00
Kevin Wolf
26b0b698c0 util/async: Add aio_co_reschedule_self()
Add a function that can be used to move the currently running coroutine
to a different AioContext (and therefore potentially a different
thread).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201005155855.256490-12-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:20 +02:00
Kevin Wolf
bb4b9ead95 hmp: Add support for coroutine command handlers
Often, QMP command handlers are not only called to handle QMP commands,
but also from a corresponding HMP command handler. In order to give them
a consistent environment, optionally run HMP command handlers in a
coroutine, too.

The implementation is a lot simpler than in QMP because for HMP, we
still block the VM while the coroutine is running.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20201005155855.256490-11-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:20 +02:00
Kevin Wolf
9ce44e2ce2 qmp: Move dispatcher to a coroutine
This moves the QMP dispatcher to a coroutine and runs all QMP command
handlers that declare 'coroutine': true in coroutine context so they
can avoid blocking the main loop while doing I/O or waiting for other
events.

For commands that are not declared safe to run in a coroutine, the
dispatcher drops out of coroutine context by calling the QMP command
handler from a bottom half.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201005155855.256490-10-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:20 +02:00
Kevin Wolf
04f22362f1 qapi: Add a 'coroutine' flag for commands
This patch adds a new 'coroutine' flag to QMP command definitions that
tells the QMP dispatcher that the command handler is safe to be run in a
coroutine.

The documentation of the new flag pretends that this flag is already
used as intended, which it isn't yet after this patch. We'll implement
this in another patch in this series.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-9-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Kevin Wolf
e69ee454b5 monitor: Make current monitor a per-coroutine property
This way, a monitor command handler will still be able to access the
current monitor, but when it yields, all other code code will correctly
get NULL from monitor_cur().

This uses a hash table to map the coroutine pointer to the current
monitor of that coroutine.  Outside of coroutine context, we associate
the current monitor with the leader coroutine of the current thread.

Approaches to implement some form of coroutine local storage directly in
the coroutine core code have been considered and discarded because they
didn't end up being much more generic than the hash table and their
performance impact on coroutines not using coroutine local storage was
unclear. As the block layer uses a coroutine per I/O request, this is a
fast path and we have to be careful. It's safest to just stay out of
this path with code only used by the monitor.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201005155855.256490-8-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Kevin Wolf
41725fa7ed qmp: Call monitor_set_cur() only in qmp_dispatch()
The correct way to set the current monitor for a coroutine handler will
be different than for a blocking handler, so monitor_set_cur() needs to
be called in qmp_dispatch().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-7-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Kevin Wolf
57d3635e42 qmp: Assert that no other monitor is active
monitor_qmp_dispatch() is never supposed to be called in the context of
another monitor, so assert that monitor_cur() is NULL instead of saving
and restoring it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-6-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Kevin Wolf
ff04108a0e hmp: Update current monitor only in handle_hmp_command()
The current monitor is updated relatively early in the command handling
code even though only the command handler actually needs it.

The current monitor will become coroutine-local later, so we can only
update it when we know in which coroutine the command will be exectued.
Move it to handle_hmp_command() where this information will be
available.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201005155855.256490-5-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Kevin Wolf
947e47448d monitor: Use getter/setter functions for cur_mon
cur_mon really needs to be coroutine-local as soon as we move monitor
command handlers to coroutines and let them yield. As a first step, just
remove all direct accesses to cur_mon so that we can implement this in
the getter function later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-4-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Kevin Wolf
87e6f4a4d6 monitor: Add Monitor parameter to monitor_get_cpu_index()
Most callers actually don't have to rely on cur_mon, but already know
for which monitor they call monitor_get_cpu_index().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-3-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Kevin Wolf
dcba65f824 monitor: Add Monitor parameter to monitor_set_cpu()
Most callers actually don't have to rely on cur_mon, but already know
for which monitor they call monitor_set_cpu().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201005155855.256490-2-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09 07:08:19 +02:00
Daniel Henrique Barboza
307e7a34dc specs/ppc-spapr-numa: update with new NUMA support
This update provides more in depth information about the
choices and drawbacks of the new NUMA support for the
spapr machine.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20201007172849.302240-6-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-09 15:06:14 +11:00