Commit Graph

12569 Commits

Author SHA1 Message Date
Alex Bennée
3918fe16b0 docs/devel: more documentation on the use of suffixes
Using _qemu is a little confusing. Let's use _compat for these sorts
of things. We should also mention _impl which is another common suffix
in the code base.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20220105135009.1584676-25-alex.bennee@linaro.org>
2022-01-18 16:42:42 +00:00
Alex Bennée
33973e1e1f hw/arm: add control knob to disable kaslr_seed via DTB
Generally a guest needs an external source of randomness to properly
enable things like address space randomisation. However in a trusted
boot environment where the firmware will cryptographically verify
components having random data in the DTB will cause verification to
fail. Add a control knob so we can prevent this being added to the
system DTB.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jerome Forissier <jerome@forissier.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20220105135009.1584676-22-alex.bennee@linaro.org>
2022-01-18 16:42:42 +00:00
Daniel P. Berrangé
021e3fa33b ui: avoid warnings about directdb on Alpine / musl libc
On Alpine, SDL is built with directfb support and this triggers warnings
during QEMU build

In file included from /usr/include/directfb/direct/thread.h:38,
                 from /usr/include/directfb/direct/debug.h:43,
                 from /usr/include/directfb/direct/interface.h:36,
                 from /usr/include/directfb/directfb.h:49,
                 from /usr/include/SDL2/SDL_syswm.h:80,
                 from /builds/berrange/qemu/include/ui/sdl2.h:8,
                 from ../ui/sdl2-gl.c:31:
/usr/include/directfb/direct/os/waitqueue.h:41:25: error: redundant redeclaration of 'direct_waitqueue_init' [-Werror=redundant-decls]
   41 | DirectResult DIRECT_API direct_waitqueue_init        ( DirectWaitQueue *queue );
      |                         ^~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211215141949.3512719-5-berrange@redhat.com>
Message-Id: <20220105135009.1584676-5-alex.bennee@linaro.org>
2022-01-18 16:42:41 +00:00
John Snow
9dcafa400e spice: Update QXLInterface for spice >= 0.15.0
spice updated the spelling (and arguments) of "attache_worker" in
0.15.0. Update QEMU to match, preventing -Wdeprecated-declarations
compilations from reporting build errors.

See also:
974692bda1

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211215141949.3512719-3-berrange@redhat.com>
Message-Id: <20220105135009.1584676-3-alex.bennee@linaro.org>
2022-01-18 16:42:41 +00:00
Peter Maydell
1cd2ad11d3 Block layer patches
- qemu-storage-daemon: Add vhost-user-blk help
 - block-backend: Fix use-after-free for BDS pointers after aio_poll()
 - qemu-img: Fix sparseness of output image with unaligned ranges
 - vvfat: Fix crashes in read-write mode
 - Fix device deletion events with -device JSON syntax
 - Code cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmHhf5gRHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9YBMA//ZkaIigVsfjRoeUh2MccgOuvYpXZtq4po
 q7l6AwGLbBpTt5Fy468gYhwmXuwHCapTMRmvWf6mpb86jtJ6vdbE16L0Z4/Z9iiW
 C0w69fsAAP9XyI+f7Q5FNtzz3jWztKowgyhkU33izbwYM7dm5Xw1q5bDkOiIBNoO
 d8cdxLC1oQGEWJmGLgmbaM/ow0iDogFpT8zU5j0VE3uK01si8pblWlXm1SM3nOK9
 b4uROqKYsTzTny/zX7KxD4SX3UGKYK393rQxr5HdmTiW14uGfB+EVfBxJmn07Qch
 lWM/v9tYoP1aVbR6IL5osAQdmbDYX0zsRMq5UA+dQ6OqnE3GpluVrYIFoaUSoShf
 S704hYdWgO0sKfpAYgJgGo6y0mglnp9Z7xO4Ng3XUNj0gvfgnOe3CdCdXIOeTFwC
 eP+KlFvbUT2xpTqI6ttBgKCcwKHA3hgWCnlo39C80bL1ZVKWSqh6zORfwmptouQ3
 BmuhEqZRyoYrknrTELN+lIKK2gP6MLup/ymeXWOOOE58KSpmrdeBAXmgJNXX3ucx
 lAWGsIz0CxdaKQoZpKpikho4rhrGkqZ33B3H7mdcsKS6zYzmsDIqa9FzUjtpvN2V
 K/jXlK7dv58Y+LLzpcuJAf8HNnitA107WD5RA1s5nTw0ahD2GwR4UPzEhnSO9/nT
 yZ3dGUysj7Q=
 =dnBv
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

- qemu-storage-daemon: Add vhost-user-blk help
- block-backend: Fix use-after-free for BDS pointers after aio_poll()
- qemu-img: Fix sparseness of output image with unaligned ranges
- vvfat: Fix crashes in read-write mode
- Fix device deletion events with -device JSON syntax
- Code cleanups

# gpg: Signature made Fri 14 Jan 2022 13:50:16 GMT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  iotests/testrunner.py: refactor test_field_width
  block: drop BLK_PERM_GRAPH_MOD
  qemu-img: make is_allocated_sectors() more efficient
  iotests: Test qemu-img convert of zeroed data cluster
  vvfat: Fix vvfat_write() for writes before the root directory
  vvfat: Fix size of temporary qcow file
  iotests/308: Fix for CAP_DAC_OVERRIDE
  iotests/stream-error-on-reset: New test
  block-backend: prevent dangling BDS pointers across aio_poll()
  qapi/block: Restrict vhost-user-blk to CONFIG_VHOST_USER_BLK_SERVER
  qemu-storage-daemon: Add vhost-user-blk help
  docs: Correct 'vhost-user-blk' spelling
  softmmu: fix device deletion events with -device JSON syntax
  include/sysemu/blockdev.h: remove drive_get_max_devs
  include/sysemu/blockdev.h: remove drive_mark_claimed_by_board and inline drive_def
  block_int: make bdrv_backing_overridden static

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-14 15:56:30 +00:00
Peter Maydell
0b3f07ebf2 - bugfixes for ui, usb, audio, display
- change default display resolution
 - add horizontal scrolling support
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmHhHcUACgkQTLbY7tPo
 cThE7RAAmWfnog5sqX9jWfXnRmgMSWT1ExBUUlgFFEcwjGnp7OnvXRx5rEN/E8lX
 hxI8769ds7oxwTlRjiOwNfN64xJE1byY3jCRbw2OJI6nDa1ZmrUVQqzOkwBng/SI
 Y6UULkW/fl1ahg73/DW1Oggy97BPGazbo1wi13VM+Xyw8BVceqEyXHdyr7siB34d
 ytZNsQ/InVTrCnL68qoukW9lTb4XRrKdkci22xYTH15Q4r/Bwm55TdWFI8QUbt4T
 G9ec1jsC0aFUniZyaJDH98T9/KIauWzFaGYTxwzHCy0gJoiwSwcnugFcaM3iu0pQ
 Fhq7yGnkzGTXUNspZL2Vf0TF2cPLyJzE0CaM8m6V2U3kcRiGjf6XA7ixQiFxqyoE
 N6XemQNOdwQCAyIJVUZHffMeA1bM352XbO01VpVnBJrdqZ46c5wbVIg73yYTTR96
 CaGGYVya0q7dGw1tf0UG44ipXqx2Lhh6Ml9wRoLpi0wgpqxYM4gTfbw/zYlH+vLg
 jm5XBaWcJoUrBIzFLqXYQds7ZuWEPRsMD3jTZ7krhAqmRODuFLEtvPtWJp4j+bRG
 7v9xmRGBljV6j/yMa/B9d/WiqcZi0LlCP5xJyBQMOAA+MTIQlJfWU7HP/6W3mgn5
 LpqMXAogO1Cdco0GaPcj/j+wrX7LIQb9I1w6eIOFZcahSlRccR4=
 =LOeb
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kraxel/tags/kraxel-20220114-pull-request' into staging

- bugfixes for ui, usb, audio, display
- change default display resolution
- add horizontal scrolling support

# gpg: Signature made Fri 14 Jan 2022 06:52:53 GMT
# gpg:                using RSA key A0328CFFB93A17A79901FE7D4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/kraxel-20220114-pull-request:
  ui/input-legacy: pass horizontal scroll information
  ui/sdl2: pass horizontal scroll information to the device code
  ui/gtk: pass horizontal scroll information to the device code
  ui/cocoa: pass horizontal scroll information to the device code
  ps2: Initial horizontal scroll support
  edid: Added support for 4k@60 Hz monitor
  edid: set default resolution to 1280x800 (WXGA)
  hw/mips/jazz: Inline vga_mmio_init() and remove it
  hw/display/vga-mmio: QOM'ify vga_mmio_init() as TYPE_VGA_MMIO
  hw/display/vga-mmio: Inline vga_mm_init()
  hw/display: Rename VGA_ISA_MM -> VGA_MMIO
  uas: add missing return
  ui: fix gtk clipboard clear assertion
  ui/dbus: fix buffer-overflow detected by ASAN
  hw/audio/intel-hda: fix stream reset
  dsoundaudio: fix crackling audio recordings
  jackaudio: use ifdefs to hide unavailable functions
  ui/vnc.c: Fixed a deadlock bug.
  usb: allow max 8192 bytes for desc
  hw/usb/dev-wacom: add missing HID descriptor

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-14 13:21:41 +00:00
Vladimir Sementsov-Ogievskiy
64631f3681 block: drop BLK_PERM_GRAPH_MOD
First, this permission never protected a node from being changed, as
generic child-replacing functions don't check it.

Second, it's a strange thing: it presents a permission of parent node
to change its child. But generally, children are replaced by different
mechanisms, like jobs or qmp commands, not by nodes.

Graph-mod permission is hard to understand. All other permissions
describe operations which done by parent node on its child: read,
write, resize. Graph modification operations are something completely
different.

The only place where BLK_PERM_GRAPH_MOD is used as "perm" (not shared
perm) is mirror_start_job, for s->target. Still modern code should use
bdrv_freeze_backing_chain() to protect from graph modification, if we
don't do it somewhere it may be considered as a bug. So, it's a bit
risky to drop GRAPH_MOD, and analyzing of possible loss of protection
is hard. But one day we should do it, let's do it now.

One more bit of information is that locking the corresponding byte in
file-posix doesn't make sense at all.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210902093754.2352-1-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14 12:03:16 +01:00
Emanuele Giuseppe Esposito
eac32e2232 include/sysemu/blockdev.h: remove drive_get_max_devs
Remove drive_get_max_devs, as it is not used by anyone.

Last use was removed in commit 8f2d75e81d
("hw: Drop superfluous special checks for orphaned -drive").

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211215121140.456939-4-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14 12:03:16 +01:00
Emanuele Giuseppe Esposito
cc67f28ea2 include/sysemu/blockdev.h: remove drive_mark_claimed_by_board and inline drive_def
drive_def is only a particular use case of
qemu_opts_parse_noisily, so it can be inlined.

Also remove drive_mark_claimed_by_board, as it is only defined
but not implemented (nor used) anywhere.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20211215121140.456939-3-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14 12:03:16 +01:00
Emanuele Giuseppe Esposito
fa8fc1d09f block_int: make bdrv_backing_overridden static
bdrv_backing_overridden is only used in block.c, so there is
no need to leave it in block_int.h

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211215121140.456939-2-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14 12:03:16 +01:00
Peter Maydell
1001c9d9c0 Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmHfDFIACgkQnKSrs4Gr
 c8hoMwf/QPaU1svRdP9pPiMkJiwmtmgacEKEfrF3I8w8aOOf3dLPyUKafuStJtfZ
 Fhl2631jHL7JKKQKGomJhdzQovHAPsPEC8YFxesB1LvO0LIX4UtYplkxkj27In2D
 9w+cIMVMTkFyIv/5GgTaFBbnmk2at4tqXkcGmcblp0qZCMsElJvGWOkToM+Fjot4
 A4jYUCviqQqdt4j558UjIdecdaWy+5Cnej3NsKwH5V62o2uZY1+7vu0cf0ARcja1
 kptZBbvMIfjyl1TeuJWuEya8aWo0KwIbbs3tVKz16Na7RXlG01mYCwGLAVkBADCD
 mJaM1jZVADtUZyoCkh4M4KBBwFnFCw==
 =ITwP
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Wed 12 Jan 2022 17:13:54 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha-gitlab/tags/block-pull-request:
  virtio: unify dataplane and non-dataplane ->handle_output()
  virtio: use ->handle_output() instead of ->handle_aio_output()
  virtio-scsi: prepare virtio_scsi_handle_cmd for dataplane
  virtio-blk: drop unused virtio_blk_handle_vq() return value
  virtio: get rid of VirtIOHandleAIOOutput
  aio-posix: split poll check from ready handler

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-14 10:43:32 +00:00
Daniel P. Berrangé
de72c4b7cd edid: set default resolution to 1280x800 (WXGA)
Currently QEMU defaults to a resolution of 1024x768 when exposing EDID
info to the guest OS. The EDID default info is important as this will
influence what resolution many guest OS will configure the screen with
on boot. It can also potentially influence what resolution the firmware
will configure the screen with, though until very recently EDK2 would
not handle EDID info.

One important thing to bear in mind is that the default graphics card
driver provided by Windows will leave the display set to whatever
resolution was enabled by the firmware on boot. Even if sufficient
VRAM is available, the resolution can't be changed without installing
new drivers. IOW, the default resolution choice is quite important
for usability of Windows.

Modern real world monitor hardware for desktop/laptop has supported
resolutions higher than 1024x768 for a long time now, perhaps as long
as 15+ years. There are quite a wide variety of native resolutions in
use today, however, and in wide screen form factors the height may not
be all that tall.

None the less, it is considered that there is scope for making the
QEMU default resolution slightly larger.

In considering what possible new default could be suitable, choices
considered were 1280x720 (720p), 1280x800 (WXGA) and 1280x1024 (SXGA).

In many ways, vertical space is the most important, and so 720p was
discarded due to loosing vertical space, despite being 25% wider.

The SXGA resolution would be good, but when taking into account
window titlebars/toolbars and window manager desktop UI, this might
be a little too tall for some users to fit the guest on their physical
montior.

This patch thus suggests a modest change to 1280x800 (WXGA). This
only consumes 1 MB per colour channel, allowing double buffered
framebuffer in 8 MB of VRAM. Width wise this is 25% larger than
QEMU's current default, but height wise this only adds 5%, so the
difference isn't massive on the QEMU side.

Overall there doesn't appear to be a compelling reason to stick
with 1024x768 resolution.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20211129140508.1745130-1-berrange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-01-13 10:59:16 +01:00
Philippe Mathieu-Daudé
7336c94434 hw/mips/jazz: Inline vga_mmio_init() and remove it
vga_mmio_init() is used only one time and not very helpful,
inline and remove it.

Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20211206224528.563588-5-f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-01-13 10:58:54 +01:00
Philippe Mathieu-Daudé
23f6e3b11b hw/display/vga-mmio: QOM'ify vga_mmio_init() as TYPE_VGA_MMIO
Introduce TYPE_VGA_MMIO, a sysbus device.

While there is no change in the vga_mmio_init()
interface, this is a migration compatibility break
of the MIPS Acer Pica 61 Jazz machine (pica61).

Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20211206224528.563588-4-f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-01-13 10:58:54 +01:00
Philippe Mathieu-Daudé
3ac25236ea hw/display: Rename VGA_ISA_MM -> VGA_MMIO
There is no ISA bus part in the MMIO VGA device, so rename:

 *  hw/display/vga-isa-mm.c -> hw/display/vga-mmio.c
 *  CONFIG_VGA_ISA_MM -> CONFIG_VGA_MMIO
 *  ISAVGAMMState -> VGAMmioState
 *  isa_vga_mm_init() -> vga_mmio_init()

Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211206224528.563588-2-f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-01-13 10:58:54 +01:00
Stefan Hajnoczi
db608fb784 virtio: unify dataplane and non-dataplane ->handle_output()
Now that virtio-blk and virtio-scsi are ready, get rid of
the handle_aio_output() callback. It's no longer needed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Stefan Hajnoczi
186b969173 virtio-blk: drop unused virtio_blk_handle_vq() return value
The return value of virtio_blk_handle_vq() is no longer used. Get rid of
it. This is a step towards unifying the dataplane and non-dataplane
virtqueue handler functions.

Prepare virtio_blk_handle_output() to be used by both dataplane and
non-dataplane by making the condition for starting ioeventfd more
specific. This way it won't trigger when dataplane has already been
started.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Stefan Hajnoczi
d93d16c045 virtio: get rid of VirtIOHandleAIOOutput
The virtqueue host notifier API
virtio_queue_aio_set_host_notifier_handler() polls the virtqueue for new
buffers. AioContext previously required a bool progress return value
indicating whether an event was handled or not. This is no longer
necessary because the AioContext polling API has been split into a poll
check function and an event handler function. The event handler is only
run when we know there is work to do, so it doesn't return bool.

The VirtIOHandleAIOOutput function signature is now the same as
VirtIOHandleOutput. Get rid of the bool return value.

Further simplifications will be made for virtio-blk and virtio-scsi in
the next patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Stefan Hajnoczi
826cc32423 aio-posix: split poll check from ready handler
Adaptive polling measures the execution time of the polling check plus
handlers called when a polled event becomes ready. Handlers can take a
significant amount of time, making it look like polling was running for
a long time when in fact the event handler was running for a long time.

For example, on Linux the io_submit(2) syscall invoked when a virtio-blk
device's virtqueue becomes ready can take 10s of microseconds. This
can exceed the default polling interval (32 microseconds) and cause
adaptive polling to stop polling.

By excluding the handler's execution time from the polling check we make
the adaptive polling calculation more accurate. As a result, the event
loop now stays in polling mode where previously it would have fallen
back to file descriptor monitoring.

The following data was collected with virtio-blk num-queues=2
event_idx=off using an IOThread. Before:

168k IOPS, IOThread syscalls:

  9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0)    = 16
  9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8)                         = 8
  9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8)                         = 8
  9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3
  9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0)    = 32

174k IOPS (+3.6%), IOThread syscalls:

  9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0)    = 32
  9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8)                         = 8
  9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8)                         = 8
  9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50)    = 32

Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because
the IOThread stays in polling mode instead of falling back to file
descriptor monitoring.

As usual, polling is not implemented on Windows so this patch ignores
the new io_poll_read() callback in aio-win32.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-2-stefanha@redhat.com

[Fixed up aio_set_event_notifier() calls in
tests/unit/test-fdmon-epoll.c added after this series was queued.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Daniel Henrique Barboza
7e1e0912ec ppc/pnv: turn pnv_phb4_update_regions() into static
Its only callers are inside pnv_phb4.c.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220111131027.599784-6-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Daniel Henrique Barboza
dc8e2914ab ppc/pnv: turn 'phb' into a pointer in struct PnvPhb4PecStack
At this moment, stack->phb is the plain PnvPHB4 device itself instead of
a pointer to the device. This will present a problem when adding user
creatable devices because we can't deal with this struct and the
realize() callback from the user creatable device.

We can't get rid of this attribute, similar to what we did when enabling
pnv-phb3 user creatable devices, because pnv_phb4_update_regions() needs
to access stack->phb to do its job. This function is called twice in
pnv_pec_stk_update_map(), which is one of the nested xscom write
callbacks (via pnv_pec_stk_nest_xscom_write()). In fact,
pnv_pec_stk_update_map() code comment is explicit about how the order of
the unmap/map operations relates with the PHB subregions.

All of this indicates that this code is tied together in a way that we
either go on a crusade, featuring lots of refactories and redesign and
considerable pain, to decouple stack and phb mapping, or we allow stack
update_map operations to access the associated PHB as it is today even
after introducing pnv-phb4 user devices.

This patch chooses the latter. Instead of getting rid of stack->phb,
turn it into a PHB pointer. This will allow us to assign an user created
PHB to an existing stack later. In this process,
pnv_pec_stk_instance_init() is removed because stack->phb is being
initialized in stk_realize() instead.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220111131027.599784-4-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Daniel Henrique Barboza
5032f5d705 pnv_phb4_pec.c: move pnv_pec_phb_offset() to pnv_phb4.c
The logic inside pnv_pec_phb_offset() will be useful in the next patch
to determine the stack that should contain a PHB4 device.

Move the function to pnv_phb4.c and make it public since there's no
pnv_phb4_pec.h header. While we're at it, add 'stack_index' as a
parameter and make the function return 'phb-id' directly. And rename it
to pnv_phb4_pec_get_phb_id() to be even clearer about the function
intent.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220110143346.455901-3-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Daniel Henrique Barboza
451575816c pnv_phb4.c: change TYPE_PNV_PHB4_ROOT_BUS name
Similar to what was happening with pnv-phb3 buses,
TYPE_PNV_PHB4_ROOT_BUS set to "pnv-phb4-root-bus" is a bit too long for
a default root bus name. The usual default name for theses buses in QEMU
are 'pcie', but we want to make a distinction between pnv-phb4 buses and
other PCIE buses, at least as far as default name goes, because not all
PCIE devices are attachable to a pnv-phb4 root-bus type.

Changing the default to 'pnv-phb4-root' allow us to have a shorter name
while making this bus distinct, and the user can always set its own bus
naming via the "id" attribute anyway.

This is the 'info qtree' output after this change, using a powernv9
domain with 2 sockets and default settings enabled:

qemu-system-ppc64 -m 4G -machine powernv9,accel=tcg \
     -smp 2,sockets=2,cores=1,threads=1

  dev: pnv-phb4, id ""
    index = 5 (0x5)
    chip-id = 1 (0x1)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb4-root.11
      type pnv-phb4-root
      dev: pnv-phb4-root-port, id ""
(...)
  dev: pnv-phb4, id ""
    index = 0 (0x0)
    chip-id = 1 (0x1)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb4-root.6
      type pnv-phb4-root
      dev: pnv-phb4-root-port, id ""
(..)
  dev: pnv-phb4, id ""
    index = 5 (0x5)
    chip-id = 0 (0x0)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb4-root.5
      type pnv-phb4-root
      dev: pnv-phb4-root-port, id ""
(...)
  dev: pnv-phb4, id ""
    index = 0 (0x0)
    chip-id = 0 (0x0)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb4-root.0
      type pnv-phb4-root
      dev: pnv-phb4-root-port, id ""

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220110143346.455901-11-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Daniel Henrique Barboza
41cb8d319d pnv_phb3.h: change TYPE_PNV_PHB3_ROOT_BUS name
The TYPE_PNV_PHB3_ROOT_BUS name is used as the default bus name when
the dev has no 'id'. However, pnv-phb3-root-bus is a bit too long to be
used as a bus name.

Most common QEMU buses and PCI controllers are named based on their bus
type (e.g. pSeries spapr-pci-host-bridge is called 'pci'). The most
common name for a PCIE bus controller in QEMU is 'pcie'. Naming it
'pcie' would break the documented use of the pnv-phb3 device, since
'pcie.0' would now refer to the root bus instead of the first root port.

There's nothing particularly wrong with the 'root-bus' name used before,
aside from the fact that 'root-bus' is being used for pnv-phb3 and
pnv-phb4 created buses, which is not quite correct since these buses
aren't implemented the same way in QEMU - you can't plug a
pnv-phb4-root-port into a pnv-phb3 root bus, for example.

This patch renames it as 'pnv-phb3-root', which is a compromise between
the existing and the previously used name. Creating 3 phbs without ID
will result in an "info qtree" output similar to this:

bus: main-system-bus
  type System
  dev: pnv-phb3, id ""
    index = 2 (0x2)
    chip-id = 0 (0x0)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb3-root.2
      type pnv-phb3-root
(...)
  dev: pnv-phb3, id ""
    index = 1 (0x1)
    chip-id = 0 (0x0)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb3-root.1
      type pnv-phb3-root
(...)
  dev: pnv-phb3, id ""
    index = 0 (0x0)
    chip-id = 0 (0x0)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb3-root.0
      type pnv-phb3-root

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220105212338.49899-11-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Cédric Le Goater
eb93c82888 ppc/pnv: Move num_phbs under Pnv8Chip
It is not used elsewhere so that's where it belongs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220105212338.49899-10-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Cédric Le Goater
c29dd0034d ppc/pnv: Reparent user created PHB3 devices to the PnvChip
The powernv machine uses the object hierarchy to populate the device
tree and each device should be parented to the chip it belongs to.
This is not the case for user created devices which are parented to
the container "/unattached".

Make sure a PHB3 device is parented to its chip by reparenting the
object if necessary.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220105212338.49899-8-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Cédric Le Goater
1f6a88fffc ppc/pnv: Introduce support for user created PHB3 devices
PHB3 devices and PCI devices can now be added to the powernv8 machine
using :

  -device pnv-phb3,chip-id=0,index=1 \
  -device nec-usb-xhci,bus=pci.1,addr=0x0

The 'index' property identifies the PHB3 in the chip. In case of user
created devices, a lookup on 'chip-id' is required to assign the
owning chip.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220105212338.49899-7-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Daniel Henrique Barboza
1360fd832b pnv_phb4.c: make pnv-phb4-root-port user creatable
We want to create only the absolutely minimal amount of devices when
running with -nodefaults. The root port is something that the machine
can boot up without. But, to do that, we need to provide a way for the
user to add them by hand.

This patch makes pnv-phb4-root-port user creatable and then uses the
pnv_phb_attach_root_port() helper to add a pnv_phb4_root_port only when
running with default settings.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220105212338.49899-5-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Cédric Le Goater
a71cd51e2a ppc/pnv: Attach PHB3 root port device when defaults are enabled
This cleanups the PHB3 model a bit more since the root port is an
independent device and it will ease our task when adding user created
PHB3s.

pnv_phb_attach_root_port() is made public in pnv.c so it can be reused
with the pnv_phb4 root port later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220105212338.49899-4-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-12 11:28:27 +01:00
Peter Maydell
bf99e0ec9a virtio: revert config interrupt changes
Lots of fallout from config interrupt changes. Author wants to rework
 the patches. Let's revert quickly so others don't suffer meanwhile.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmHcnzAPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpPQEH/2Kwx1xc8P3U4UULco1VlQ//mC0l0QBJTVpt
 qrt0HFEmXvO8G//ybWeqiwGO88aZ4oaskQN4JxnqJ+QmRp8VA7XDM1QyH6lNdIzt
 tT0xay5QXn/fOZIPwzRYJMqnvrei2mkeIIT60E9BBqVL/c+r3bHGkzmE1sFBSE14
 k/el3le/FJ7eaxU8WnddoIxjKmc9R6xpno96TRiAphdsI7OizHvaMYJ4swE+yQ21
 UHoZkkrJxE3RV7t99CQXHAA2FZIjVtPOegro0t+7a1/EqxRtKkuUJIJSpFWThbWf
 I95BGx8m8g+sDqUYSf6wLR57PQLcOUC1aQqP5N1bptAmyKgR+1I=
 =yQ3c
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio: revert config interrupt changes

Lots of fallout from config interrupt changes. Author wants to rework
the patches. Let's revert quickly so others don't suffer meanwhile.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 10 Jan 2022 21:03:44 GMT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  Revert "virtio: introduce macro IRTIO_CONFIG_IRQ_IDX"
  Revert "virtio-pci: decouple notifier from interrupt process"
  Revert "virtio-pci: decouple the single vector from the interrupt process"
  Revert "vhost: introduce new VhostOps vhost_set_config_call"
  Revert "vhost-vdpa: add support for config interrupt"
  Revert "virtio: add support for configure interrupt"
  Revert "vhost: add support for configure interrupt"
  Revert "virtio-net: add support for configure interrupt"
  Revert "virtio-mmio: add support for configure interrupt"
  Revert "virtio-pci: add support for configure interrupt"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-11 10:12:29 +00:00
Michael S. Tsirkin
a882b57123 Revert "virtio: introduce macro IRTIO_CONFIG_IRQ_IDX"
This reverts commit bf1d85c166.

Fixes: bf1d85c166 ("virtio: introduce macro IRTIO_CONFIG_IRQ_IDX")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:02:54 -05:00
Michael S. Tsirkin
98b34e030e Revert "vhost: introduce new VhostOps vhost_set_config_call"
This reverts commit 8806237234.

Fixes: 8806237234 ("vhost: introduce new VhostOps vhost_set_config_call")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:02:01 -05:00
Michael S. Tsirkin
81c3ebc32f Revert "virtio: add support for configure interrupt"
This reverts commit 081f864f56.

Fixes: 081f864f56 ("virtio: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:01:28 -05:00
Michael S. Tsirkin
a86d1a0a93 Revert "vhost: add support for configure interrupt"
This reverts commit f7220a7ce2.

Fixes: f7220a7ce2 ("vhost: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:01:11 -05:00
Michael S. Tsirkin
b3ef6664b7 Revert "virtio-net: add support for configure interrupt"
This reverts commit 497679d510.

Fixes: 497679d510 ("virtio-net: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:00:54 -05:00
Frédéric Pétrot
332dab6878 target/riscv: setup everything for rv64 to support rv128 execution
This patch adds the support of the '-cpu rv128' option to
qemu-system-riscv64 so that we can indicate that we want to run rv128
executables.
Still, there is no support for 128-bit insns at that stage so qemu fails
miserably (as expected) if launched with this option.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-8-frederic.petrot@univ-grenoble-alpes.fr
[ Changed by AF
 - Rename CPU to "x-rv128"
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-08 15:46:10 +10:00
Frédéric Pétrot
e9d07601f6 qemu/int128: addition of div/rem 128-bit operations
Addition of div and rem on 128-bit integers, using the 128/64->128 divu and
64x64->128 mulu in host-utils.
These operations will be used within div/rem helpers in the 128-bit riscv
target.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-4-frederic.petrot@univ-grenoble-alpes.fr
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-08 15:46:10 +10:00
Frédéric Pétrot
c7f9dd5465 exec/memop: Adding signed quad and octo defines
Adding defines to handle signed 64-bit and unsigned 128-bit quantities in
memory accesses.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-3-frederic.petrot@univ-grenoble-alpes.fr
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-08 15:46:10 +10:00
Frédéric Pétrot
fc313c6434 exec/memop: Adding signedness to quad definitions
Renaming defines for quad in their various forms so that their signedness is
now explicit.
Done using git grep as suggested by Philippe, with a bit of hand edition to
keep assignments aligned.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-2-frederic.petrot@univ-grenoble-alpes.fr
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-08 15:46:10 +10:00
Alistair Francis
d4452c6924 hw/riscv: virt: Allow support for 32 cores
Linux supports up to 32 cores for both 32-bit and 64-bit RISC-V, so
let's set that as the maximum for the virt board.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/435
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220105213937.1113508-9-alistair.francis@opensource.wdc.com>
2022-01-08 15:46:09 +10:00
Richard Henderson
d70075373a virtio,pci,pc: features,fixes,cleanups
New virtio mem options.
 A vhost-user cleanup.
 Control over smbios entry point type.
 Config interrupt support for vdpa.
 Fixes, cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmHY2zEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpCiEH/jv5tHUffDdGz5M2pN7FTWPQ9UAMQZXbn5AS
 PPVutOI/B+ILYBuNjYLvMGeq6ymG4/0DM940/jkQwCWD4ku1OG0ReM5T5klUR8lY
 df5y1SCDv3Yoq0vxpQCnssKqbgm8Kf9tnAFjni7Lvbu3oo6DCq77m6MWEapLoEUu
 IkM+l60NKmHAClnE6RF4KobLa5srIlDTho1iBXH5S39CRF1LvP9NgnYzl7nqiEkq
 ZYQEqkKO5XGxZji9banZPJD2kxt1iL7s24QI6OJG2Lz8Hf86b0Yo7XJpmw4ShP9h
 Vl1SL3m/HhHSMBuXOb7w/EkCm59b7whXCmoyYBF/GqaxtZkvVnM=
 =4VIN
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pci,pc: features,fixes,cleanups

New virtio mem options.
A vhost-user cleanup.
Control over smbios entry point type.
Config interrupt support for vdpa.
Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 07 Jan 2022 04:30:41 PM PST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (55 commits)
  tests: acpi: Add updated TPM related tables
  acpi: tpm: Add missing device identification objects
  tests: acpi: prepare for updated TPM related tables
  virtio/vhost-vsock: don't double close vhostfd, remove redundant cleanup
  hw/scsi/vhost-scsi: don't double close vhostfd on error
  hw/scsi/vhost-scsi: don't leak vqs on error
  docs: reSTify virtio-balloon-stats documentation and move to docs/interop
  hw/i386/pc: Add missing property descriptions
  acpihp: simplify acpi_pcihp_disable_root_bus
  tests: acpi: SLIC: update expected blobs
  tests: acpi: add SLIC table test
  tests: acpi: whitelist expected blobs before changing them
  acpi: fix QEMU crash when started with SLIC table
  intel-iommu: correctly check passthrough during translation
  virtio-mem: Set "unplugged-inaccessible=auto" for the 7.0 machine on x86
  virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
  linux-headers: sync VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
  MAINTAINERS: Add a separate entry for acpi/VIOT tables
  virtio: signal after wrapping packed used_idx
  virtio-mem: Support "prealloc=on" option
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:24:24 -08:00
David Hildenbrand
23ad8dec8d virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading
unplugged memory is not supported. We have to fail feature negotiation
in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.

First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle
memory backends (or architectures) without support for the shared zeropage
in the hypervisor cleanly. Without the shared zeropage, even reading an
unpopulated virtual memory location can populate real memory and
consequently consume memory in the hypervisor. We have a guaranteed shared
zeropage only on MAP_PRIVATE anonymous memory.

Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default
long-term as even populating the shared zeropage can be problematic: for
example, without THP support (possible) or without support for the shared
huge zeropage with THP (unlikely), the PTE page tables to hold the shared
zeropage entries can consume quite some memory that cannot be reclaimed
easily.

Third, there are other optimizations+features (e.g., protection of
unplugged memory, reducing the total memory slot size and bitmap sizes)
that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.

We really only support x86 targets with virtio-mem for now (and
Linux similarly only support x86), but that might change soon, so prepare
for different targets already.

Add a new "unplugged-inaccessible" tristate property for x86 targets:
- "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy
  guests working.
- "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests
  from using the device.
- "auto" selects the default based on support for the shared zeropage.

Warn in case the property is set to "off" and we don't have support for the
shared zeropage.

For existing compat machines, the property will default to "off", to
not change the behavior but eventually warn about a problematic setup.
Short-term, we'll set the property default to "auto" for new QEMU machines.
Mid-term, we'll set the property default to "on" for new QEMU machines.
Long-term, we'll deprecate the parameter and disallow legacy
guests completely.

The property has to match on the migration source and destination. "auto"
will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long
as the qemu command line (esp. memdev) match -- so "auto" is good enough
for migration purposes and the parameter doesn't have to be migrated
explicitly.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand
3ff9b192de linux-headers: sync VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
Let's synchronize the new feature flag, available in Linux since
v5.16-rc1.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand
09b3b7e092 virtio-mem: Support "prealloc=on" option
For scarce memory resources, such as hugetlb, we want to be able to
prealloc such memory resources in order to not crash later on access. On
simple user errors we could otherwise easily run out of memory resources
an crash the VM -- pretty much undesired.

For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.

Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.

A common use case for hugetlb will be using "reserve=off,prealloc=off" for
the memory backend and "prealloc=on" for the virtio-mem device. This
way, no huge pages will be reserved for the process, but we can recover
if there are no actual huge pages when plugging memory. Libvirt is
already prepared for this.

Note that preallocation cannot protect from the OOM killer -- which
holds true for any kind of preallocation in QEMU. It's primarily useful
only for scarce memory resources such as hugetlb, or shared file-backed
memory. It's of little use for ordinary anonymous memory that can be
swapped, KSM merged, ... but we won't forbid it.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-9-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Peter Maydell
80dcd37feb hw/intc/arm_gicv3_its: Fix various off-by-one errors
The ITS code has to check whether various parameters passed in
commands are in-bounds, where the limit is defined in terms of the
number of bits that are available for the parameter.  (For example,
the GITS_TYPER.Devbits ID register field specifies the number of
DeviceID bits minus 1, and device IDs passed in the MAPTI and MAPD
command packets must fit in that many bits.)

Currently we have off-by-one bugs in many of these bounds checks.
The typical problem is that we define a max_foo as 1 << n. In
the Devbits example, we set
  s->dt.max_ids = 1UL << (GITS_TYPER.Devbits + 1).
However later when we do the bounds check we write
  if (devid > s->dt.max_ids) { /* command error */ }
which incorrectly permits a devid of 1 << n.

These bugs will not cause QEMU crashes because the ID values being
checked are only used for accesses into tables held in guest memory
which we access with address_space_*() functions, but they are
incorrect behaviour of our emulation.

Fix them by standardizing on this pattern:
 * bounds limits are named num_foos and are the 2^n value
   (equal to the number of valid foo values)
 * bounds checks are either
   if (fooid < num_foos) { good }
   or
   if (fooid >= num_foos) { bad }

In this commit we fix the handling of the number of IDs
in the device table and the collection table, and the number
of commands that will fit in the command queue.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2022-01-07 17:08:00 +00:00
Peter Maydell
6c1db43de4 hw/intc/arm_gicv3_its: Remove maxids union from TableDesc
The TableDesc struct defines properties of the in-guest-memory tables
which the guest tells us about by writing to the GITS_BASER<n>
registers.  This struct currently has a union 'maxids', but all the
fields of the union have the same type (uint32_t) and do the same
thing (record one-greater-than the maximum ID value that can be used
as an index into the table).

We're about to add another table type (the GICv4 vPE table); rather
than adding another specifically-named union field for that table
type with the same type as the other union fields, remove the union
entirely and just have a 'uint32_t max_ids' struct field.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:58 +00:00
Troy Lee
d9e9cd59df Add dummy Aspeed AST2600 Display Port MCU (DPMCU)
AST2600 Display Port MCU introduces 0x18000000~0x1803FFFF as it's memory
and io address. If guest machine try to access DPMCU memory, it will
cause a fatal error.

Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20211210083034.726610-1-troy_lee@aspeedtech.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-07 17:07:57 +00:00
David Hildenbrand
a384bfa32e util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.

While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.

More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].

This resolves the TODO in do_touch_pages().

In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.

[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com

Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Eduardo Habkost
0e4edb3b3b hw/i386: expose a "smbios-entry-point-type" PC machine property
The i440fx and Q35 machine types are both hardcoded to use the
legacy SMBIOS 2.1 (32-bit) entry point. This is a sensible
conservative choice because SeaBIOS only supports SMBIOS 2.1

EDK2, however, can also support SMBIOS 3.0 (64-bit) entry points,
and QEMU already uses this on the ARM virt machine type.

This adds a property to allow the choice of SMBIOS entry point
versions For example to opt in to 64-bit SMBIOS entry point:

   $QEMU -machine q35,smbios-entry-point-type=64

Based on a patch submitted by Daniel Berrangé.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-4-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07 05:19:55 -05:00
Eduardo Habkost
bdf54a9a7b hw/smbios: Use qapi for SmbiosEntryPointType
This prepares for exposing the SMBIOS entry point type as a
machine property on x86.

Based on a patch from Daniel P. Berrangé.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-3-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
2022-01-07 05:19:55 -05:00