33229 Commits

Author SHA1 Message Date
Igor Mammedov
f40e6a4cc1 pcihp: piix4: do not redirect hotplug controller to piix4 when ACPI hotplug is disabled
commit [1] added ability to disable ACPI PCI hotplug
on hostbridge but forgot to take into account that it
should disable all ACPI hotplug machinery in case both
hostbridge and bridge hotplug are disabled.

Commit [2] tried to fix that, however it forgot to
remove hotplug_handler override which hands hotplug
control over to piix4 hotplug controller
(uninitialized after [2]).

As result at the time bridge is plugged in, its default
(SHPC) hotplug handler is replaced by piix4 one in
  acpi_pcihp_device_plug_cb()
    ...
    if (!s->legacy_piix &&
       ...
       qbus_set_hotplug_handler(BUS(sec), OBJECT(hotplug_dev));

which is acting on uninitialized s->legacy_piix value
(0 by default) that was supposed to be initialized by
acpi_pcihp_init(), that is no longer called due to
following condition being false:

  piix4_acpi_system_hot_add_init()
    if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug) {

and the bridge ends up with piix4 as hotplug handler
instead of shpc one.

Followup hotplug on that bridge as result yields
piix4 specific error:

  Error: Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set

1) 3d7e78aa777 (Introduce a new flag for i440fx to disable PCI hotplug on the root bus)
2) df4008c9c59 (piix4: don't reserve hw resources when hotplug is off globally)

Fixes: df4008c9c59 (piix4: don't reserve hw resources when hotplug is off globally)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-12-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
0e84fd3b98 x86: pcihp: fix missing bridge AML when intermediate root-port has 'hotplug=off' set
(I practice [1] hasn't broke anything since on hardware side we unset
hotplug_handler on such intermediate port => hotplug behind it has
never worked)

When deciding if bridge should be described, the original
condition was

  cold_plugged_bridge && pcihp_bridge_en

which was replaced [1] by

  bridge has ACPI_PCIHP_PROP_BSEL

the later however is not the same thing as the original
and flips to false if intermediate bridge has hotplug
turned off (root-port with 'hotplug=off' option).

Since we already in build_pci_bridge_aml(), the question
if it's bridge is answered. Use DeviceState::hotplugged
to make decision if bridge should describe its slots.

What's left out is pcihp_bridge_en, which tells us if
ACPI bridge hotplug is enabled.

With hotplug and non hotplug part now being mostly
separated, omitting this check will only lead to
colplugged bridges describe occupied slots in case
when ACPI bridge hotplug is disabled.
Which makes behavior consistent with occupied slots
on hostbridge.

Ex (pc/DSDT.hpbrroot diff):
  ...
               Device (S20)
               {
                   Name (_ADR, 0x00040000)  // _ADR: Address
  +                Device (S08)
  +                {
  +                    Name (_ADR, 0x00010000)  // _ADR: Address
  +                }
  +
  +                Device (S10)
  +                {
  +                    Name (_ADR, 0x00020000)  // _ADR: Address
  +                }
               }
  ...

PS:
testing shows that above doesn't affect adversely guest OS
behavior: i.e. if ACPI bridge hotplug is enabled it's
expected behaviour, and with ACPI bridge hotplug is disabled
(a.k. native hotplug), it doesn't break slot enumeration
nor native hotplug. (tested with RHEL9.0 and WS2022).

1)
Fixes: 6c36ec46b0d ("pcihp: make bridge describe itself using AcpiDevAmlIfClass:build_dev_aml")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-10-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Igor Mammedov
11215a349e x86: pcihp: fix missing PCNT callchain when intermediate root-port has 'hotplug=off' set
Beside BSEL numbers change (due to 2 extra root-ports in q35/miltibridge test),
following change is expected:

       Scope (\_SB.PCI0)
       {
  ...
  +        Scope (S50)
  +        {
  +            Scope (S00)
  +            {
  +                Method (PCNT, 0, NotSerialized)
  +                {
  +                    BNUM = Zero
  +                    DVNT (PCIU, One)
  +                    DVNT (PCID, 0x03)
  +                }
  +            }
  +
  +            Method (PCNT, 0, NotSerialized)
  +            {
  +                ^S00.PCNT
  +            }
  +        }
  ...
           Method (PCNT, 0, NotSerialized)
           {
  +            ^S50.PCNT ()
               ^S13.PCNT ()
               ^S12.PCNT ()
               ^S11.PCNT ()

I practice [1] hasn't broke anything since on hardware side we unset
hotplug_handler on such intermediate port => hotplug behind it has
not been properly wired and as result not worked.

1)
Fixes: ddab4d3fae4e8 ("pcihp: compose PCNT callchain right before its user _GPE._E01")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
ab7337e3b2 vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
vhost-vdpa devices can return this feature now that blockers have been
set in case some features are not met.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-15-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
57ac831865 vdpa: block migration if SVQ does not admit a feature
Next patches enable devices to be migrated even if vdpa netdev has not
been started with x-svq. However, not all devices are migratable, so we
need to block migration if we detect that.

Block migration if we detect the device expose a feature SVQ does not
know how to work with.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-13-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
9c363cf6d5 vdpa net: block migration if the device has CVQ
Devices with CVQ need to migrate state beyond vq state.  Leaving this to
future series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
a230c4712b vdpa: disable RAM block discard only for the first device
Although it does not make a big difference, its more correct and
simplifies the cleanup path in subsequent patches.

Move ram_block_discard_disable(false) call to the top of
vhost_vdpa_cleanup because:
* We cannot use vhost_vdpa_first_dev after dev->opaque = NULL
  assignment.
* Improve the stack order in cleanup: since it is the last action taken
  in init, it should be the first at cleanup.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-10-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
c3716f260b vdpa: move vhost reset after get vring base
The function vhost.c:vhost_dev_stop calls vhost operation
vhost_dev_start(false). In the case of vdpa it totally reset and wipes
the device, making the fetching of the vring base (virtqueue state) totally
useless.

The kernel backend does not use vhost_dev_start vhost op callback, but
vhost-user do. A patch to make vhost_user_dev_start more similar to vdpa
is desirable, but it can be added on top.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-8-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
0bb302a996 vdpa: add vhost_vdpa_suspend
The function vhost.c:vhost_dev_stop fetches the vring base so the vq
state can be migrated to other devices.  However, this is unreliable in
vdpa, since we didn't signal the device to suspend the queues, making
the value fetched useless.

Suspend the device if possible before fetching first and subsequent
vring bases.

Moreover, vdpa totally reset and wipes the device at the last device
before fetch its vrings base, making that operation useless in the last
device. This will be fixed in later patches of this series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-7-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
b6662cb7e5 vdpa: add vhost_vdpa->suspended parameter
This allows vhost_vdpa to track if it is safe to get the vring base from
the device or not.  If it is not, vhost can fall back to fetch idx from
the guest buffer again.

No functional change intended in this patch, later patches will use this
field.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-6-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
4241e8bd72 vdpa: rewind at get_base, not set_base
At this moment it is only possible to migrate to a vdpa device running
with x-svq=on. As a protective measure, the rewind of the inflight
descriptors was done at the destination. That way if the source sent a
virtqueue with inuse descriptors they are always discarded.

Since this series allows to migrate also to passthrough devices with no
SVQ, the right thing to do is to rewind at the source so the base of
vrings are correct.

Support for inflight descriptors may be added in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-5-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
d83b494580 vdpa: Negotiate _F_SUSPEND feature
This is needed for qemu to know it can suspend the device to retrieve
its status and enable SVQ with it, so all the process is transparent to
the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez
b276524386 vdpa: Remember last call fd set
As SVQ can be enabled dynamically at any time, it needs to store call fd
always.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
2cb0692768 cryptodev: Use CryptoDevBackendOpInfo for operation
Move queue_index, CryptoDevCompletionFunc and opaque into struct
CryptoDevBackendOpInfo, then cryptodev_backend_crypto_operation()
needs an argument CryptoDevBackendOpInfo *op_info only. And remove
VirtIOCryptoReq from cryptodev. It's also possible to hide
VirtIOCryptoReq into virtio-crypto.c in the next step. (In theory,
VirtIOCryptoReq is a private structure used by virtio-crypto only)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-9-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
bc304a6442 cryptodev: Introduce server type in QAPI
Introduce cryptodev service type in cryptodev.json, then apply this
to related codes. Now we can remove VIRTIO_CRYPTO_SERVICE_xxx
dependence from QEMU cryptodev.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-5-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
zhenwei pi
999c789f00 cryptodev: Introduce cryptodev alg type in QAPI
Introduce cryptodev alg type in cryptodev.json, then apply this to
related codes, and drop 'enum CryptoDevBackendAlgType'.

There are two options:
1, { 'enum': 'QCryptodevBackendAlgType',
  'prefix': 'CRYPTODEV_BACKEND_ALG',
  'data': ['sym', 'asym']}
Then we can keep 'CRYPTODEV_BACKEND_ALG_SYM' and avoid lots of
changes.
2, changes in this patch(with prefix 'QCRYPTODEV_BACKEND_ALG').

To avoid breaking the rule of QAPI, use 2 here.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230301105847.253084-4-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Joao Martins
95b29658b6 vfio/migration: Query device dirty page tracking support
Now that everything has been set up for device dirty page tracking,
query the device for device dirty page tracking support.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-15-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins
e46883204c vfio/migration: Block migration with vIOMMU
Migrating with vIOMMU will require either tracking maximum
IOMMU supported address space (e.g. 39/48 address width on Intel)
or range-track current mappings and dirty track the new ones
post starting dirty tracking. This will be done as a separate
series, so add a live migration blocker until that is fixed.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-14-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins
b153402a89 vfio/common: Add device dirty page bitmap sync
Add device dirty page bitmap sync functionality. This uses the device
DMA logging uAPI to sync dirty page bitmap from the device.

Device dirty page bitmap sync is used only if all devices within a
container support device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-13-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Avihai Horon
6607109f05 vfio/common: Extract code from vfio_get_dirty_bitmap() to new function
Extract the VFIO_IOMMU_DIRTY_PAGES ioctl code in vfio_get_dirty_bitmap()
to its own function.

This will help the code to be more readable after next patch will add
device dirty page bitmap sync functionality.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-12-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:22 -07:00
Joao Martins
5255bbf4ec vfio/common: Add device dirty page tracking start/stop
Add device dirty page tracking start/stop functionality. This uses the
device DMA logging uAPI to start and stop dirty page tracking by device.

Device dirty page tracking is used only if all devices within a
container support device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20230307125450.62409-11-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07 10:21:11 -07:00
Alex Bennée
548c96095d includes: move tb_flush into its own header
This aids subsystems (like gdbstub) that want to trigger a flush
without pulling target specific headers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Message-Id: <20230302190846.2593720-8-alex.bennee@linaro.org>
Message-Id: <20230303025805.625589-8-richard.henderson@linaro.org>
2023-03-07 17:06:33 +00:00
David Woodhouse
a78c54c4f9 i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation
Now that all the work is done to enable the PV backends to work without
actual Xen, instantiate the bus from pc_basic_device_init() for emulated
mode.

This allows us finally to launch an emulated Xen guest with PV disk.

   qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \
     -m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \
     -kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \
     -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
     -device xen-disk,drive=disk,vdev=xvda

If we use -M pc instead of q35, we can even add an IDE disk and boot a
guest image normally through grub. But q35 gives us AHCI and that isn't
unplugged by the Xen magic, so the guests ends up seeing "both" disks.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
de26b26197 hw/xen: Implement soft reset for emulated gnttab
This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).

Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
d05864d23b hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore
We don't actually access the guest's page through the grant, because
this isn't real Xen, and we can just use the page we gave it in the
first place. Map the grant anyway, mostly for cosmetic purposes so it
*looks* like it's in use in the guest-visible grant table.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
0324751272 hw/xen: Add emulated implementation of XenStore operations
Now that we have an internal implementation of XenStore, we can populate
the xenstore_backend_ops to allow PV backends to talk to it.

Watches can't be processed with immediate callbacks because that would
call back into XenBus code recursively. Defer them to a QEMUBH to be run
as appropriate from the main loop. We use a QEMUBH per XS handle, and it
walks all the watches (there shouldn't be many per handle) to fire any
which have pending events. We *could* have done it differently but this
allows us to use the same struct watch_event as we have for the guest
side, and keeps things relatively simple.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
b08d88e30f hw/xen: Add emulated implementation of grant table operations
This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.

Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stick to
a page at a time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
4dfd5fb178 hw/xen: Hook up emulated implementation for event channel operations
We provided the backend-facing evtchn functions very early on as part of
the core Xen platform support, since things like timers and xenstore need
to use them.

By what may or may not be an astonishing coincidence, those functions
just *happen* all to have exactly the right function prototypes to slot
into the evtchn_backend_ops table and be called by the PV backends.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
072519037d hw/xen: Only advertise ring-page-order for xen-block if gnttab supports it
Whem emulating Xen, multi-page grants are distinctly non-trivial and we
have elected not to support them for the time being. Don't advertise
them to the guest.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant
240cc11369 hw/xen: Avoid crash when backend watch fires too early
The xen-block code ends up calling aio_poll() through blkconf_geometry(),
which means we see watch events during the indirect call to
xendev_class->realize() in xen_device_realize(). Unfortunately this call
is made before populating the initial frontend and backend device nodes
in xenstore and hence xen_block_frontend_changed() (which is called from
a watch event) fails to read the frontend's 'state' node, and hence
believes the device is being torn down. This in-turn sets the backend
state to XenbusStateClosed and causes the device to be deleted before it
is fully set up, leading to the crash.
By simply moving the call to xendev_class->realize() after the initial
xenstore nodes are populated, this sorry state of affairs is avoided.

Reported-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
4ca8cf092d hw/xen: Build PV backend drivers for CONFIG_XEN_BUS
Now that we have the redirectable Xen backend operations we can build the
PV backends even without the Xen libraries.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
e2abfe5ec6 hw/xen: Rename xen_common.h to xen_native.h
This header is now only for native Xen code, not PV backends that may be
used in Xen emulation. Since the toolstack libraries may depend on the
specific version of Xen headers that they pull in (and will set the
__XEN_TOOLS__ macro to enable internal definitions that they depend on),
the rule is that xen_native.h (and thus the toolstack library headers)
must be included *before* any of the headers in include/hw/xen/interface.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
a9ae1418b3 hw/xen: Use XEN_PAGE_SIZE in PV backend drivers
XC_PAGE_SIZE comes from the actual Xen libraries, while XEN_PAGE_SIZE is
provided by QEMU itself in xen_backend_ops.h. For backends which may be
built for emulation mode, use the latter.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
7a8a749da7 hw/xen: Move xenstore_store_pv_console_info to xen_console.c
There's no need for this to be in the Xen accel code, and as we want to
use the Xen console support with KVM-emulated Xen we'll want to have a
platform-agnostic version of it. Make it use GString to build up the
path while we're at it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant
ba2a92db1f hw/xen: Add xenstore operations to allow redirection to internal emulation
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
15e283c5b6 hw/xen: Add foreignmem operations to allow redirection to internal emulation
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
f80fad16af hw/xen: Pass grant ref to gnttab unmap operation
The previous commit introduced redirectable gnttab operations fairly
much like-for-like, with the exception of the extra arguments to the
->open() call which were always NULL/0 anyway.

This *changes* the arguments to the ->unmap() operation to include the
original ref# that was mapped. Under real Xen it isn't necessary; all we
need to do from QEMU is munmap(), then the kernel will release the grant,
and Xen does the tracking/refcounting for the guest.

When we have emulated grant tables though, we need to do all that for
ourselves. So let's have the back ends keep track of what they mapped
and pass it in to the ->unmap() method for us.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
c412ba47b2 hw/xen: Add gnttab operations to allow redirection to internal emulation
Move the existing code using libxengnttab to xen-operations.c and allow
the operations to be redirected so that we can add emulation of grant
table mapping for backend drivers.

In emulation, mapping more than one grant ref to be virtually contiguous
would be fairly difficult. The best way to do it might be to make the
ram_block mappings actually backed by a file (shmem or a deleted file,
perhaps) so that we can have multiple *shared* mappings of it. But that
would be fairly intrusive.

Making the backend drivers cope with page *lists* instead of expecting
the mapping to be contiguous is also non-trivial, since some structures
would actually *cross* page boundaries (e.g. the 32-bit blkif responses
which are 12 bytes).

So for now, we'll support only single-page mappings in emulation. Add a
XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE flag to indicate that the native Xen
implementation *does* support multi-page maps, and a helper function to
query it.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
b6cacfea0b hw/xen: Add evtchn operations to allow redirection to internal emulation
The existing implementation calling into the real libxenevtchn moves to
a new file hw/xen/xen-operations.c, and is called via a function table
which in a subsequent commit will also be able to invoke the emulated
event channel support.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant
831b0db8ab hw/xen: Create initial XenStore nodes
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
766804b101 hw/xen: Implement core serialize/deserialize methods for xenstore_impl
This implements the basic migration support in the back end, with unit
tests that give additional confidence in the node-counting already in
the tree.

However, the existing PV back ends like xen-disk don't support migration
yet. They will reset the ring and fail to continue where they left off.
We will fix that in future, but not in time for the 8.0 release.

Since there's also an open question of whether we want to serialize the
full XenStore or only the guest-owned nodes in /local/domain/${domid},
for now just mark the XenStore device as unmigratable.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Paul Durrant
be1934dfef hw/xen: Implement XenStore permissions
Store perms as a GList of strings, check permissions.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
7cabbdb70d hw/xen: Watches on XenStore transactions
Firing watches on the nodes that still exist is relatively easy; just
walk the tree and look at the nodes with refcount of one.

Firing watches on *deleted* nodes is more fun. We add 'modified_in_tx'
and 'deleted_in_tx' flags to each node. Nodes with those flags cannot
be shared, as they will always be unique to the transaction in which
they were created.

When xs_node_walk would need to *create* a node as scaffolding and it
encounters a deleted_in_tx node, it can resurrect it simply by clearing
its deleted_in_tx flag. If that node originally had any *data*, they're
gone, and the modified_in_tx flag will have been set when it was first
deleted.

We then attempt to send appropriate watches when the transaction is
committed, properly delete the deleted_in_tx nodes, and remove the
modified_in_tx flag from the others.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
7248b87cb0 hw/xen: Implement XenStore transactions
Given that the whole thing supported copy on write from the beginning,
transactions end up being fairly simple. On starting a transaction, just
take a ref of the existing root; swap it back in on a successful commit.

The main tree has a transaction ID too, and we keep a record of the last
transaction ID given out. if the main tree is ever modified when it isn't
the latest, it gets a new transaction ID.

A commit can only succeed if the main tree hasn't moved on since it was
forked. Strictly speaking, the XenStore protocol allows a transaction to
succeed as long as nothing *it* read or wrote has changed in the interim,
but no implementations do that; *any* change is sufficient to abort a
transaction.

This does not yet fire watches on the changed nodes on a commit. That bit
is more fun and will come in a follow-on commit.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
6e1330090d hw/xen: Implement XenStore watches
Starts out fairly simple: a hash table of watches based on the path.

Except there can be multiple watches on the same path, so the watch ends
up being a simple linked list, and the head of that list is in the hash
table. Which makes removal a bit of a PITA but it's not so bad; we just
special-case "I had to remove the head of the list and now I have to
replace it in / remove it from the hash table". And if we don't remove
the head, it's a simple linked-list operation.

We do need to fire watches on *deleted* nodes, so instead of just a simple
xs_node_unref() on the topmost victim, we need to recurse down and fire
watches on them all.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
3ef7ff83ca hw/xen: Add basic XenStore tree walk and write/read/directory support
This is a fairly simple implementation of a copy-on-write tree.

The node walk function starts off at the root, with 'inplace == true'.
If it ever encounters a node with a refcount greater than one (including
the root node), then that node is shared with other trees, and cannot
be modified in place, so the inplace flag is cleared and we copy on
write from there on down.

Xenstore write has 'mkdir -p' semantics and will create the intermediate
nodes if they don't already exist, so in that case we flip the inplace
flag back to true as we populate the newly-created nodes.

We put a copy of the absolute path into the buffer in the struct walk_op,
with *two* NUL terminators at the end. As xs_node_walk() goes down the
tree, it replaces the next '/' separator with a NUL so that it can use
the 'child name' in place. The next recursion down then puts the '/'
back and repeats the exercise for the next path element... if it doesn't
hit that *second* NUL termination which indicates the true end of the
path.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
David Woodhouse
0254c4d19d hw/xen: Add xenstore wire implementation and implementation stubs
This implements the basic wire protocol for the XenStore commands, punting
all the actual implementation to xs_impl_* functions which all just return
errors for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
2023-03-07 17:04:30 +00:00
Peter Maydell
7b0f0aa55f * Fix missing memory barriers
* Fix comments about memory ordering
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQHIqoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPYBwgArUaS0KGrBM1XmRUUpXnJokmA37n8
 ft477na+XW+p9VYi27B0R01P8j+AkCrAO0Ir1MLG7axjn5KiRMnbf2uBgqasEREv
 repJEXsqISoxA6vvAvnehKHAI9zu8b7frRc/30b6EOrrZpn0JKePSNRTyBu2seGO
 NFDXPVA2Wom+xXaNSEGt0dmoJ6AzEVIZKhUIwyvUWOC7MXuuIkRWn9/nySUdvEt0
 RIFPPk7JCjnEc32vb4Xnq/Ncsy20tMIM1hlDxMOVNq3brjeSCzS0PPPSjE/X5OtW
 Yn5YS0nCyD7wjP2dkXI4I1lUPxUUx6LvMz1aGbJCfyjSX41mNES/agoGgA==
 =KEUo
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream-mb' of https://gitlab.com/bonzini/qemu into staging

* Fix missing memory barriers
* Fix comments about memory ordering

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQHIqoUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPYBwgArUaS0KGrBM1XmRUUpXnJokmA37n8
# ft477na+XW+p9VYi27B0R01P8j+AkCrAO0Ir1MLG7axjn5KiRMnbf2uBgqasEREv
# repJEXsqISoxA6vvAvnehKHAI9zu8b7frRc/30b6EOrrZpn0JKePSNRTyBu2seGO
# NFDXPVA2Wom+xXaNSEGt0dmoJ6AzEVIZKhUIwyvUWOC7MXuuIkRWn9/nySUdvEt0
# RIFPPk7JCjnEc32vb4Xnq/Ncsy20tMIM1hlDxMOVNq3brjeSCzS0PPPSjE/X5OtW
# Yn5YS0nCyD7wjP2dkXI4I1lUPxUUx6LvMz1aGbJCfyjSX41mNES/agoGgA==
# =KEUo
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Mar 2023 11:40:26 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream-mb' of https://gitlab.com/bonzini/qemu:
  async: clarify usage of barriers in the polling case
  async: update documentation of the memory barriers
  physmem: add missing memory barrier
  qemu-coroutine-lock: add smp_mb__after_rmw()
  aio-wait: switch to smp_mb__after_rmw()
  edu: add smp_mb__after_rmw()
  qemu-thread-win32: cleanup, fix, document QemuEvent
  qemu-thread-posix: cleanup, fix, document QemuEvent
  qatomic: add smp_mb__before/after_rmw()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-07 17:02:06 +00:00
Karthikeyan Pasupathi
7840ba985a hw/arm/aspeed: Modified BMC FRU byte data in yosemitev2
Modified BMC FRU data in yosemite v2 platform.

Tested: Tested and Verified in yosemitev2 platform.

Fixes: 34f73a81e6 ("hw/arm/aspeed: Adding new machine Yosemitev2 in QEMU")
Signed-off-by: Karthikeyan Pasupathi <pkarthikeyan1509@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230307104833.3587947-1-pkarthikeyan1509@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-07 16:53:18 +01:00
Karthikeyan Pasupathi
a09d357dd3 hw/arm/aspeed: Added TMP421 type sensor's support in tiogapass
Added TMP421 type sensor support in tiogapass platform.

Tested: Tested and verified in tiogapass platform.

Signed-off-by: Karthikeyan Pasupathi <pkarthikeyan1509@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230307103334.3586755-1-pkarthikeyan1509@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-03-07 16:53:18 +01:00