qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Stefan Hajnoczi	a4eef0711b	vhost-user-blk-pci: default num_queues to -smp N Automatically size the number of request virtqueues to match the number of vCPUs. This ensures that completion interrupts are handled on the same vCPU that submitted the request. No IPI is necessary to complete an I/O request and performance is improved. The maximum number of MSI-X vectors and virtqueues limit are respected. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20200818143348.310613-8-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-08-27 08:29:13 -04:00
Stefan Hajnoczi	9445e1e15e	virtio-blk-pci: default num_queues to -smp N Automatically size the number of virtio-blk-pci request virtqueues to match the number of vCPUs. Other transports continue to default to 1 request virtqueue. A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are handled on the same vCPU that submitted the request. No IPI is necessary to complete an I/O request and performance is improved. The maximum number of MSI-X vectors and virtqueues limit are respected. Performance improves from 78k to 104k IOPS on a 32 vCPU guest with 101 virtio-blk-pci devices (ioengine=libaio, iodepth=1, bs=4k, rw=randread with NVMe storage). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Message-Id: <20200818143348.310613-7-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-08-27 08:29:13 -04:00
Stefan Hajnoczi	6a55882284	virtio-scsi-pci: default num_queues to -smp N Automatically size the number of virtio-scsi-pci, vhost-scsi-pci, and vhost-user-scsi-pci request virtqueues to match the number of vCPUs. Other transports continue to default to 1 request virtqueue. A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are handled on the same vCPU that submitted the request. No IPI is necessary to complete an I/O request and performance is improved. The maximum number of MSI-X vectors and virtqueues limit are respected. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200818143348.310613-6-stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-08-27 08:29:13 -04:00
Cornelia Huck	3ff3c5d317	hw: add compat machines for 5.2 Add 5.2 machine types for arm/i440fx/q35/s390x/spapr. Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200819144016.281156-1-cohuck@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-08-19 10:45:48 -04:00
Hogan Wang	2ebc21216f	hw/pci-host: save/restore pci host config register The pci host config register is used to save PCI address for read/write config data. If guest writes a value to config register, and then QEMU pauses the vcpu to migrate, after the migration, the guest will continue to write pci config data, and the write data will be ignored because of new qemu process losing the config register state. To trigger the bug: 1. guest is booting in seabios. 2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then expects to disable the SMRAM by pci_config_writeb. 3. after guest writes the pci host config register, QEMU pauses vcpu to finish migration. 4. guest write of config data(0x0A) fails to disable the SMRAM because the config register state is lost. 5. guest continues to boot and crashes in ipxe option ROM due to SMRAM in enabled state. Example Reproducer: step 1. Make modifications to seabios and qemu for increase reproduction efficiency, write 0xf0 to 0x402 port notify qemu to stop vcpu after 0x0cf8 port wrote i440 configure register. qemu stop vcpu when catch 0x402 port wrote 0xf0. seabios:/src/hw/pci.c @@ -52,6 +52,11 @@ void pci_config_writeb(u16 bdf, u32 addr, u8 val) writeb(mmconfig_addr(bdf, addr), val); } else { outl(ioconfig_cmd(bdf, addr), PORT_PCI_CMD); + if (bdf == 0 && addr == 0x72 && val == 0xa) { + dprintf(1, "stop vcpu\n"); + outb(0xf0, 0x402); // notify qemu to stop vcpu + dprintf(1, "resume vcpu\n"); + } outb(val, PORT_PCI_DATA + (addr & 3)); } } qemu:hw/char/debugcon.c @@ -60,6 +61,9 @@ static void debugcon_ioport_write(void opaque, hwaddr addr, uint64_t val, printf(" [debugcon: write addr=0x%04" HWADDR_PRIx " val=0x%02" PRIx64 "]\n", addr, val); #endif + if (ch == 0xf0) { + vm_stop(RUN_STATE_PAUSED); + } / XXX this blocks entire thread. Rewrite to use * qemu_chr_fe_write and background I/O callbacks */ qemu_chr_fe_write_all(&s->chr, &ch, 1); step 2. start vm1 by the following command line, and then vm stopped. $ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\ -netdev tap,ifname=tap-test,id=hostnet0,vhost=on,downscript=no,script=no\ -device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\ -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\ -chardev file,id=seabios,path=/var/log/test.seabios,append=on\ -device isa-debugcon,iobase=0x402,chardev=seabios\ -monitor stdio step 3. start vm2 to accept vm1 state. $ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\ -netdev tap,ifname=tap-test1,id=hostnet0,vhost=on,downscript=no,script=no\ -device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\ -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\ -chardev file,id=seabios,path=/var/log/test.seabios,append=on\ -device isa-debugcon,iobase=0x402,chardev=seabios\ -monitor stdio \ -incoming tcp:127.0.0.1:8000 step 4. execute the following qmp command in vm1 to migrate. (qemu) migrate tcp:127.0.0.1:8000 step 5. execute the following qmp command in vm2 to resume vcpu. (qemu) cont Before this patch, we get KVM "emulation failure" error on vm2. This patch fixes it. Cc: qemu-stable@nongnu.org Signed-off-by: Hogan Wang <hogan.wang@huawei.com> Message-Id: <20200727084621.3279-1-hogan.wang@huawei.com> Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-07-27 10:24:39 -04:00
Markus Armbruster	7a309cc95b	qom: Change object_get_canonical_path_component() not to malloc object_get_canonical_path_component() returns a malloced copy of a property name on success, null on failure. 19 of its 25 callers immediately free the returned copy. Change object_get_canonical_path_component() to return the property name directly. Since modifying the name would be wrong, adjust the return type to const char *. Drop the free from the 19 callers become simpler, add the g_strdup() to the other six. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200714160202.3121879-4-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Li Qiang <liq3ea@gmail.com>	2020-07-21 16:23:43 +02:00
Markus Armbruster	668f62ec62	error: Eliminate error_propagate() with Coccinelle, part 1 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) \| - !fun(args, &err, args2) + !fun(args, errp, args2) \| - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var \| !var \| var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; \| return var; ) } @depends on rule1 \|\| rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	62a35aaa31	qapi: Use returned bool to check for failure, Coccinelle part The previous commit enables conversion of visit_foo(..., &err); if (err) { ... } to if (!visit_foo(..., errp)) { ... } for visitor functions that now return true / false on success / error. Coccinelle script: @@ identifier fun =~ "check_list\|input_type_enum\|lv_start_struct\|lv_type_bool\|lv_type_int64\|lv_type_str\|lv_type_uint64\|output_type_enum\|parse_type_bool\|parse_type_int64\|parse_type_null\|parse_type_number\|parse_type_size\|parse_type_str\|parse_type_uint64\|print_type_bool\|print_type_int64\|print_type_null\|print_type_number\|print_type_size\|print_type_str\|print_type_uint64\|qapi_clone_start_alternate\|qapi_clone_start_list\|qapi_clone_start_struct\|qapi_clone_type_bool\|qapi_clone_type_int64\|qapi_clone_type_null\|qapi_clone_type_number\|qapi_clone_type_str\|qapi_clone_type_uint64\|qapi_dealloc_start_list\|qapi_dealloc_start_struct\|qapi_dealloc_type_anything\|qapi_dealloc_type_bool\|qapi_dealloc_type_int64\|qapi_dealloc_type_null\|qapi_dealloc_type_number\|qapi_dealloc_type_str\|qapi_dealloc_type_uint64\|qobject_input_check_list\|qobject_input_check_struct\|qobject_input_start_alternate\|qobject_input_start_list\|qobject_input_start_struct\|qobject_input_type_any\|qobject_input_type_bool\|qobject_input_type_bool_keyval\|qobject_input_type_int64\|qobject_input_type_int64_keyval\|qobject_input_type_null\|qobject_input_type_number\|qobject_input_type_number_keyval\|qobject_input_type_size_keyval\|qobject_input_type_str\|qobject_input_type_str_keyval\|qobject_input_type_uint64\|qobject_input_type_uint64_keyval\|qobject_output_start_list\|qobject_output_start_struct\|qobject_output_type_any\|qobject_output_type_bool\|qobject_output_type_int64\|qobject_output_type_null\|qobject_output_type_number\|qobject_output_type_str\|qobject_output_type_uint64\|start_list\|visit_check_list\|visit_check_struct\|visit_start_alternate\|visit_start_list\|visit_start_struct\|visit_type_."; expression list args; typedef Error; Error err; @@ - fun(args, &err); - if (err) + if (!fun(args, &err)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-19-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Paolo Bonzini	f983ff95f4	vmport: move compat properties to hw_compat_5_0 The patches that introduced the properties were submitted when QEMU 5.0 had not been released yet, so they got merged under the wrong heading. Move them to hw_compat_5_0 so that 5.0 machine types get the pre-patch behavior. Fixes: `b889212973` ("hw/i386/vmport: Propagate IOPort read to vCPU EAX register") Fixes: `0342ee761e` ("hw/i386/vmport: Set EAX to -1 on failed and unsupported commands") Fixes: `f8bdc55037` ("hw/i386/vmport: Report vmware-vmx-type in CMD_GETVERSION") Fixes: `aaacf1c15a` ("hw/i386/vmport: Add support for CMD_GETBIOSUUID") Reported-by: Laurent Vivier <lvivier@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-26 09:39:40 -04:00
Peter Maydell	7d3660e798	* Miscellaneous fixes and feature enablement (many) * SEV refactoring (David) * Hyper-V initial support (Jon) * i386 TCG fixes (x87 and SSE, Joseph) * vmport cleanup and improvements (Philippe, Liran) * Use-after-free with vCPU hot-unplug (Nengyuan) * run-coverity-scan improvements (myself) * Record/replay fixes (Pavel) * -machine kernel_irqchip=split improvements for INTx (Peter) * Code cleanups (Philippe) * Crash and security fixes (PJP) * HVF cleanups (Roman) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7jpdAUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMfjwf/X7+0euuE9dwKFKDDMmIi+4lRWnq7 gSOyE1BYSfDIUXRIukf64konXe0VpiotNYlyEaYnnQjkMdGm5E9iXKF+LgEwXj/t NSGkfj5J3VeWRG4JJp642CSN/aZWO8uzkenld3myCnu6TicuN351tDJchiFwAk9f wsXtgLKd67zE8MLVt8AP0rNTbzMHttPXnPaOXDCuwjMHNvMEKnC93UeOeM0M4H5s 3Dl2HvsNWZ2SzUG9mAbWp0bWWuoIb+Ep9//87HWANvb7Z8jratRws18i6tYt1sPx 8zOnUS87sVnh1CQlXBDd9fEcqBUVgR9pAlqaaYavNhFp5eC31euvpDU8Iw== =F4sU -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * Miscellaneous fixes and feature enablement (many) * SEV refactoring (David) * Hyper-V initial support (Jon) * i386 TCG fixes (x87 and SSE, Joseph) * vmport cleanup and improvements (Philippe, Liran) * Use-after-free with vCPU hot-unplug (Nengyuan) * run-coverity-scan improvements (myself) * Record/replay fixes (Pavel) * -machine kernel_irqchip=split improvements for INTx (Peter) * Code cleanups (Philippe) * Crash and security fixes (PJP) * HVF cleanups (Roman) # gpg: Signature made Fri 12 Jun 2020 16:57:04 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (116 commits) target/i386: Remove obsolete TODO file stubs: move Xen stubs to accel/ replay: fix replay shutdown for console mode exec/cpu-common: Move MUSB specific typedefs to 'hw/usb/hcd-musb.h' hw/usb: Move device-specific declarations to new 'hcd-musb.h' header exec/memory: Remove unused MemoryRegionMmio type checkpatch: reversed logic with acpi test checks target/i386: sev: Unify SEVState and SevGuestState target/i386: sev: Remove redundant handle field target/i386: sev: Remove redundant policy field target/i386: sev: Remove redundant cbitpos and reduced_phys_bits fields target/i386: sev: Partial cleanup to sev_state global target/i386: sev: Embed SEVState in SevGuestState target/i386: sev: Rename QSevGuestInfo target/i386: sev: Move local structure definitions into .c file target/i386: sev: Remove unused QSevGuestInfoClass xen: fix build without pci passthrough i386: hvf: Drop HVFX86EmulatorState i386: hvf: Move mmio_buf into CPUX86State i386: hvf: Move lazy_flags into CPUX86State ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/i386/acpi-build.c	2020-06-12 23:06:22 +01:00
Liran Alon	aaacf1c15a	hw/i386/vmport: Add support for CMD_GETBIOSUUID This is VMware documented functionallity that some guests rely on. Returns the BIOS UUID of the current virtual machine. Note that we also introduce a new compatability flag "x-cmds-v2" to make sure to expose new VMPort commands only to new machine-types. This flag will also be used by the following patches that will introduce additional VMPort commands. Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20200312165431.82118-10-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:09:47 -04:00
Liran Alon	f8bdc55037	hw/i386/vmport: Report vmware-vmx-type in CMD_GETVERSION As can be seen from VmCheck_GetVersion() in open-vm-tools code, CMD_GETVERSION should return vmware-vmx-type in ECX register. Default is to fake host as VMware ESX server. But user can control this value by "-global vmport.vmware-vmx-type=X". Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20200312165431.82118-7-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:09:45 -04:00
Liran Alon	0342ee761e	hw/i386/vmport: Set EAX to -1 on failed and unsupported commands This is used as a signal for VMware Tools to know if a command it attempted to invoke, failed or is unsupported. As a result, VMware Tools will either report failure to user or fallback to another backdoor command in attempt to perform some operation. A few examples: * open-vm-tools TimeSyncReadHost() function fallbacks to CMD_GETTIMEFULL command when CMD_GETTIMEFULL_WITH_LAG fails/unsupported. * open-vm-tools Hostinfo_NestingSupported() function verifies EAX != -1 to check for success. * open-vm-tools Hostinfo_VCPUInfoBackdoor() functions checks if reserved-bit is set to indicate command is unimplemented. Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20200312165431.82118-5-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:09:44 -04:00
Liran Alon	b889212973	hw/i386/vmport: Propagate IOPort read to vCPU EAX register vmport_ioport_read() returns the value that should propagate to vCPU EAX register when guest reads VMPort IOPort (i.e. By x86 IN instruction). However, because vmport_ioport_read() calls cpu_synchronize_state(), the returned value gets overridden by the value in QEMU vCPU EAX register. i.e. cpu->env.regs[R_EAX]. To fix this issue, change vmport_ioport_read() to explicitly override cpu->env.regs[R_EAX] with the value it wish to propagate to vCPU EAX register. Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20200312165431.82118-4-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:09:43 -04:00
Alexander Duyck	7483cbbaf8	virtio-balloon: Implement support for page poison reporting feature We need to make certain to advertise support for page poison reporting if we want to actually get data on if the guest will be poisoning pages. Add a value for reporting the poison value being used if page poisoning is enabled in the guest. With this we can determine if we will need to skip free page reporting when it is enabled in the future. The value currently has no impact on existing balloon interfaces. In the case of existing balloon interfaces the onus is on the guest driver to reapply whatever poison is in place. When we add free page reporting the poison value is used to determine if we can perform in-place page reporting. The expectation is that a reported page will already contain the value specified by the poison, and the reporting of the page should not change that value. Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Message-Id: <20200527041400.12700.33251.stgit@localhost.localdomain>	2020-06-09 14:18:04 -04:00
Markus Armbruster	d2623129a7	qom: Drop parameter @errp of object_property_add() & friends The only way object_property_add() can fail is when a property with the same name already exists. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Same for its variants, except for object_property_add_child(), which additionally fails when the child already has a parent. Parentage is also under program control, so this is a programming error, too. We have a bit over 500 callers. Almost half of them pass &error_abort, slightly fewer ignore errors, one test case handles errors, and the remaining few callers pass them to their own callers. The previous few commits demonstrated once again that ignoring programming errors is a bad idea. Of the few ones that pass on errors, several violate the Error API. The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. ich9_pm_add_properties(), sparc32_ledma_realize(), sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize() are wrong that way. When the one appropriate choice of argument is &error_abort, letting users pick the argument is a bad idea. Drop parameter @errp and assert the preconditions instead. There's one exception to "duplicate property name is a programming error": the way object_property_add() implements the magic (and undocumented) "automatic arrayification". Don't drop @errp there. Instead, rename object_property_add() to object_property_try_add(), and add the obvious wrapper object_property_add(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-15-armbru@redhat.com> [Two semantic rebase conflicts resolved]	2020-05-15 07:07:58 +02:00
Markus Armbruster	7eecec7d12	qom: Drop object_property_set_description() parameter @errp object_property_set_description() and object_class_property_set_description() fail only when property @name is not found. There are 85 calls of object_property_set_description() and object_class_property_set_description(). None of them can fail: * 84 immediately follow the creation of the property. * The one in spapr_rng_instance_init() refers to a property created in spapr_rng_class_init(), from spapr_rng_properties[]. Every one of them still gets to decide what to pass for @errp. 51 calls pass &error_abort, 32 calls pass NULL, one receives the error and propagates it to &error_abort, and one propagates it to &error_fatal. I'm actually surprised none of them violates the Error API. What are we gaining by letting callers handle the "property not found" error? Use when the property is not known to exist is simpler: you don't have to guard the call with a check. We haven't found such a use in 5+ years. Until we do, let's make life a bit simpler and drop the @errp parameter. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-8-armbru@redhat.com> [One semantic rebase conflict resolved]	2020-05-15 07:06:49 +02:00
Cornelia Huck	541aaa1df8	hw: add compat machines for 5.1 Add 5.1 machine types for arm/i440fx/q35/s390x/spapr. Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-id: 20200429144605.7262-1-cohuck@redhat.com	2020-05-06 10:12:16 -04:00
Shameer Kolothum	394f0f72fd	fw_cfg: Migrate ACPI table mr sizes separately Any sub-page size update to ACPI MRs will be lost during migration, as we use aligned size in ram_load_precopy() -> qemu_ram_resize() path. This will result in inconsistency in FWCfgEntry sizes between source and destination. In order to avoid this, save and restore them separately during migration. Up until now, this problem may not be that relevant for x86 as both ACPI table and Linker MRs gets padded and aligned. Also at present, qemu_ram_resize() doesn't invoke callback to update FWCfgEntry for unaligned size changes. But since we are going to fix the qemu_ram_resize() in the subsequent patch, the issue may become more serious especially for RSDP MR case. Moreover, the issue will soon become prominent in arm/virt as well where the MRs are not padded or aligned at all and eventually have acpi table changes as part of future additions like NVDIMM hot-add feature. Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Message-Id: <20200403101827.30664-3-shameerali.kolothum.thodi@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-04-13 06:55:54 -04:00
Cornelia Huck	02501fc393	compat: disable edid on correct virtio-gpu device Commit `bb15791166` ("compat: disable edid on virtio-gpu base device") tried to disable 'edid' on the virtio-gpu base device. However, that device is not 'virtio-gpu', but 'virtio-gpu-device'. Fix it. Fixes: `bb15791166` ("compat: disable edid on virtio-gpu base device") Reported-by: Lukáš Doktor <ldoktor@redhat.com> Tested-by: Lukáš Doktor <ldoktor@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-id: 20200318093919.24942-1-cohuck@redhat.com Cc: qemu-stable@nongnu.org Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2020-03-20 07:50:52 +01:00
Babu Moger	8cb30e3aec	machine: Add SMP Sockets in CpuTopology Store the smp sockets in CpuTopology. The socket information required to build the apic id in EPYC mode. Right now socket information is not passed to down when decoding the apic id. Add the socket information here. Signed-off-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <158396718647.58170.2278448323151215741.stgit@naples-babu.amd.com>	2020-03-17 19:48:10 -04:00
Dr. David Alan Gilbert	4ba59be1d6	machine/memory encryption: Disable mem merge When a host is running with memory encryption, the memory isn't visible to the host kernel; attempts to merge that memory are futile because what it's really comparing is encrypted memory, usually encrypted with different keys. Automatically turn mem-merge off when memory encryption is specified. https://bugzilla.redhat.com/show_bug.cgi?id=1796356 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20200130175046.85850-1-dgilbert@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-03-17 19:48:10 -04:00
Paolo Bonzini	ca6155c0f2	Merge tag 'patchew/20200219160953.13771-1-imammedo@redhat.com' of https://github.com/patchew-project/qemu into HEAD This series removes ad hoc RAM allocation API (memory_region_allocate_system_memory) and consolidates it around hostmem backend. It allows to * resolve conflicts between global -mem-prealloc and hostmem's "policy" option, fixing premature allocation before binding policy is applied * simplify complicated memory allocation routines which had to deal with 2 ways to allocate RAM. * reuse hostmem backends of a choice for main RAM without adding extra CLI options to duplicate hostmem features. A recent case was -mem-shared, to enable vhost-user on targets that don't support hostmem backends [1] (ex: s390) * move RAM allocation from individual boards into generic machine code and provide them with prepared MemoryRegion. * clean up deprecated NUMA features which were tied to the old API (see patches) - "numa: remove deprecated -mem-path fallback to anonymous RAM" - (POSTPONED, waiting on libvirt side) "forbid '-numa node,mem' for 5.0 and newer machine types" - (POSTPONED) "numa: remove deprecated implicit RAM distribution between nodes" Introduce a new machine.memory-backend property and wrapper code that aliases global -mem-path and -mem-alloc into automatically created hostmem backend properties (provided memory-backend was not set explicitly given by user). A bulk of trivial patches then follow to incrementally convert individual boards to using machine.memory-backend provided MemoryRegion. Board conversion typically involves: * providing MachineClass::default_ram_size and MachineClass::default_ram_id so generic code could create default backend if user didn't explicitly provide memory-backend or -m options * dropping memory_region_allocate_system_memory() call * using convenience MachineState::ram MemoryRegion, which points to MemoryRegion allocated by ram-memdev On top of that for some boards: * missing ram_size checks are added (typically it were boards with fixed ram size) * ram_size fixups are replaced by checks and hard errors, forcing user to provide correct "-m" values instead of ignoring it and continuing running. After all boards are converted, the old API is removed and memory allocation routines are cleaned up.	2020-02-25 09:19:00 +01:00
Denis Plotnikov	c9b7d9ec21	virtio: increase virtqueue size for virtio-scsi and virtio-blk The goal is to reduce the amount of requests issued by a guest on 1M reads/writes. This rises the performance up to 4% on that kind of disk access pattern. The maximum chunk size to be used for the guest disk accessing is limited with seg_max parameter, which represents the max amount of pices in the scatter-geather list in one guest disk request. Since seg_max is virqueue_size dependent, increasing the virtqueue size increases seg_max, which, in turn, increases the maximum size of data to be read/write from a guest disk. More details in the original problem statment: https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html Suggested-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> Message-id: 20200214074648.958-1-dplotnikov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-02-22 08:26:47 +00:00
Igor Mammedov	82b911aaff	machine: introduce convenience MachineState::ram the new field will be used by boards to get access to main RAM memory region and will help to save boiler plate in boards which often introduce a field or variable just for this purpose. Memory region will be equivalent to what currently used memory_region_allocate_system_memory() is returning apart from that it will come from hostmem backend. Followup patches will incrementally switch boards to using RAM from MachineState::ram. Patch takes care of non-NUMA case and follow up patch will initialize MachineState::ram for NUMA case. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20200219160953.13771-5-imammedo@redhat.com>	2020-02-19 16:49:53 +00:00
Igor Mammedov	aa8b183974	machine: introduce memory-backend property Property will contain link to memory backend that will be used for backing initial RAM. Follow up commit will alias -mem-path and -mem-prealloc CLI options into memory backend options to make memory handling consistent (using only hostmem backend family for guest RAM allocation). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200219160953.13771-3-imammedo@redhat.com>	2020-02-19 16:49:53 +00:00
Gerd Hoffmann	ed71c09ffd	qxl: introduce hardware revision 5 The only difference to hardware revision 4 is that the device doesn't switch to VGA mode in case someone happens to touch a VGA register, which should make things more robust in configurations with multiple vga devices. Swtiching back to VGA mode happens on reset, either full machine reset or qxl device reset (QXL_IO_RESET ioport command). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-id: 20200206074358.4274-1-kraxel@redhat.com	2020-02-13 08:31:40 +01:00
Philippe Mathieu-Daudé	11a18c84db	hw/core: Allow setting 'virtio-blk-device.scsi' property on OSX host Commit `ed65fd1a27` ("virtio-blk: switch off scsi-passthrough by default") changed the default value of the 'scsi' property of virtio-blk, which is only available on Linux hosts. It also added an unconditional compat entry for 2.4 or earlier machines. Trying to set this property on a pre-2.5 machine on OSX, we get: Unexpected error in object_property_find() at qom/object.c:1201: qemu-system-x86_64: -device virtio-blk-pci,id=scsi0,drive=drive0: can't apply global virtio-blk-device.scsi=true: Property '.scsi' not found Fix this error by marking the property optional. Fixes: `ed65fd1a27` ("virtio-blk: switch off scsi-passthrough by default") Suggested-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 20200207001404.1739-1-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-02-07 16:49:39 +00:00
Yuri Benditovich	32187f3d90	usb-redir: remove 'remote wakeup' flag from configuration descriptor If the redirected device has this capability, Windows guest may place the device into D2 and expect it to wake when the device becomes active, but this will never happen. For example, when internal Bluetooth adapter is redirected, keyboards and mice connected to it do not work. Current commit removes this capability (starting from machine 5.0) Set 'usb-redir.suppress-remote-wake' property to 'off' to keep 'remote wake' as is or to 'on' to remove 'remote wake' on 4.2 or earlier. Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Message-id: 20200108091044.18055-3-yuri.benditovich@daynix.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2020-01-13 09:17:31 +01:00
Yuri Benditovich	7bacaf5fea	usb-host: remove 'remote wakeup' flag from configuration descriptor If the redirected device has this capability, Windows guest may place the device into D2 and expect it to wake when the device becomes active, but this will never happen. For example, when internal Bluetooth adapter is redirected, keyboards and mice connected to it do not work. Current commit removes this capability (starting from machine 5.0) Set 'usb-host.suppress-remote-wake' property to 'off' to keep 'remote wake' as is or to 'on' to remove 'remote wake' on 4.2 or earlier. Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Message-id: 20200108091044.18055-2-yuri.benditovich@daynix.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2020-01-13 09:17:31 +01:00
Peter Maydell	973d306dd6	virtio, pci, pc: fixes, features Bugfixes all over the place. HMAT support. New flags for vhost-user-blk utility. Auto-tuning of seg max for virtio storage. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl4TaMEPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpvzgH/2LyDAzCa9h93ikSJjmyUk5FUaqve38daEb3 S3JYjwKxQx7u1ydooKhvBQnBCZ2i3S+k62gfYyKB+nBv8xvjs0Eg5D1YJ5E8hciy lf5OFGWWtX2iPDjZwQwT13kiJe0o3JRGxJJ6XqTEG+1EYOp7cky/FEv4PD030b9m I2wROZ/Am+onB9YJX8c0Vv1CG+AryuJNXnvwQzTXEjj4U7bEYUyJwVZaCRyAdWQ3 uYXIZN9VwjVX6BFvy9ZAJbEsUVJvOM1/aQaDqcrLz+VlzRT7bRkKHi2G3vakrm1I r5OpgyLo84132awCncbSykKDH5o8WaxLaJBjGmuBfasMz9wPzAg= =uL1o -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging virtio, pci, pc: fixes, features Bugfixes all over the place. HMAT support. New flags for vhost-user-blk utility. Auto-tuning of seg max for virtio storage. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (32 commits) intel_iommu: add present bit check for pasid table entries intel_iommu: a fix to vtd_find_as_from_bus_num() virtio-net: delete also control queue when TX/RX deleted virtio: reset region cache when on queue deletion virtio-mmio: update queue size on guest write tests: add virtio-scsi and virtio-blk seg_max_adjust test virtio: make seg_max virtqueue size dependent hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35 vhost-user-scsi: reset the device if supported vhost-user: add VHOST_USER_RESET_DEVICE to reset devices hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument hw/pci/pci_host: Remove redundant PCI_DPRINTF() virtio-mmio: Clear v2 transport state on soft reset ACPI: add expected files for HMAT tests (acpihmat) tests/bios-tables-test: add test cases for ACPI HMAT tests/numa: Add case for QMP build HMAT hmat acpi: Build Memory Side Cache Information Structure(s) hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s) hmat acpi: Build Memory Proximity Domain Attributes Structure(s) numa: Extend CLI to provide memory side cache information ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-01-07 16:25:00 +00:00
Denis Plotnikov	1bf8a989a5	virtio: make seg_max virtqueue size dependent Before the patch, seg_max parameter was immutable and hardcoded to 126 (128 - 2) without respect to queue size. This has two negative effects: 1. when queue size is < 128, we have Virtio 1.1 specfication violation: (2.6.5.3.1 Driver Requirements) seq_max must be <= queue_size. This violation affects the old Linux guests (ver < 4.14). These guests crash on these queue_size setups. 2. when queue_size > 128, as was pointed out by Denis Lunev <den@virtuozzo.com>, seg_max restrics guest's block request length which affects guests' performance making them issues more block request than needed. https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html To mitigate this two effects, the patch adds the property adjusting seg_max to queue size automaticaly. Since seg_max is a guest visible parameter, the property is machine type managable and allows to choose between old (seg_max = 126 always) and new (seg_max = queue_size - 2) behaviors. Not to change the behavior of the older VMs, prevent setting the default seg_max_adjust value for older machine types. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> Message-Id: <20191220140905.1718-2-dplotnikov@virtuozzo.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-01-06 12:04:43 -05:00
Peter Maydell	6fb0dae9ef	x86 and machine queue, 2019-12-20 Bug fix: * Resolve CPU models to v1 by default (Eduardo Habkost) Cleanup: * Remove incorrect numa_mem_supported checks (Igor Mammedov) -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAl39HqYUHGVoYWJrb3N0 QHJlZGhhdC5jb20ACgkQKAeTb5hNxabHeBAAkybU8+JzzqXoG9e16MiQiUQ0vSy9 MFkWIlsD5RCncdlI7s7yyuPUa7GEkJztRxzanvP2BcbMvHHpaM01EgOsZuZfld8Z R6lQaTZdAC4XQFPmD14ccIQ/r8cDUXRfUhasKXq3tNQdXORUw5/T9XHwyn3kvHUT /nEglWdUG0LmRQMQNRpbSgQ4B0jx+RwRg6KLGRm/mqlwiFV8nULLB8IYDMrxHSu3 iY/PAFOMqQbRbbDQ7rK3l7u0TyRTB41FTx8s2eT9Is2V3HZU9P9lbWPQBMnxPwxm VYo/LVO6smZ9gZbyCcPZtOn95ay5gGk+fQ9Twg6/l1tHsK7vmNxn8Z3y+QWEvJ30 BnOJ2Y0RaFNBDrhiIqJu12Lp0nJXMDi96tAS71hqwsJssjzLYSpD/faoKO0vDyR9 RLoumrXLcrgeMopRKsft8ZkJIakHlXc+85AuIMZ9obhcz4liC7r/IbjOqOumKTPN 8feLmzqdldAmh0jvJCfyu1n4qhH4KUPPrFxOvZfuzdWkvSUbcJSkQaPwYxxQaFvo 9jRHwNNF4MTnImgQIw59ao/u6JXVM+4oY5dc+BjeGTefQKuwRRvT/54Z+v7jULwK ZKGlLnCRlYeD/U+67iBIeV2nrRM7pTkcsTWmhX+/u2pwyKpmiA4quG63KmR7dyDK 6HqJez6jOKTEARU= =psTk -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ehabkost/tags/x86-and-machine-pull-request' into staging x86 and machine queue, 2019-12-20 Bug fix: * Resolve CPU models to v1 by default (Eduardo Habkost) Cleanup: * Remove incorrect numa_mem_supported checks (Igor Mammedov) # gpg: Signature made Fri 20 Dec 2019 19:19:02 GMT # gpg: using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6 # gpg: issuer "ehabkost@redhat.com" # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full] # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/x86-and-machine-pull-request: numa: properly check if numa is supported numa: remove not needed check i386: Resolve CPU models to v1 by default Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-01-06 14:08:04 +00:00
Tao Xu	244b3f4485	numa: Extend CLI to provide initiator information for numa nodes In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT), The initiator represents processor which access to memory. And in 5.2.27.3 Memory Proximity Domain Attributes Structure, the attached initiator is defined as where the memory controller responsible for a memory proximity domain. With attached initiator information, the topology of heterogeneous memory can be described. Add new machine property 'hmat' to enable all HMAT specific options. Extend CLI of "-numa node" option to indicate the initiator numa node-id. In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report the platform's HMAT tables. Before using initiator option, enable HMAT with -machine hmat=on. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191213011929.2520-2-tao3.xu@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-01-05 07:03:03 -05:00
Michael Roth	9d7bd0826f	virtio-pci: disable vring processing when bus-mastering is disabled Currently the SLOF firmware for pseries guests will disable/re-enable a PCI device multiple times via IO/MEM/MASTER bits of PCI_COMMAND register after the initial probe/feature negotiation, as it tends to work with a single device at a time at various stages like probing and running block/network bootloaders without doing a full reset in-between. In QEMU, when PCI_COMMAND_MASTER is disabled we disable the corresponding IOMMU memory region, so DMA accesses (including to vring fields like idx/flags) will no longer undergo the necessary translation. Normally we wouldn't expect this to happen since it would be misbehavior on the driver side to continue driving DMA requests. However, in the case of pseries, with iommu_platform=on, we trigger the following sequence when tearing down the virtio-blk dataplane ioeventfd in response to the guest unsetting PCI_COMMAND_MASTER: #2 0x0000555555922651 in virtqueue_map_desc (vdev=vdev@entry=0x555556dbcfb0, p_num_sg=p_num_sg@entry=0x7fffe657e1a8, addr=addr@entry=0x7fffe657e240, iov=iov@entry=0x7fffe6580240, max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=0, sz=0) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:757 #3 0x0000555555922a89 in virtqueue_pop (vq=vq@entry=0x555556dc8660, sz=sz@entry=184) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:950 #4 0x00005555558d3eca in virtio_blk_get_request (vq=0x555556dc8660, s=0x555556dbcfb0) at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:255 #5 0x00005555558d3eca in virtio_blk_handle_vq (s=0x555556dbcfb0, vq=0x555556dc8660) at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:776 #6 0x000055555591dd66 in virtio_queue_notify_aio_vq (vq=vq@entry=0x555556dc8660) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1550 #7 0x000055555591ecef in virtio_queue_notify_aio_vq (vq=0x555556dc8660) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1546 #8 0x000055555591ecef in virtio_queue_host_notifier_aio_poll (opaque=0x555556dc86c8) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:2527 #9 0x0000555555d02164 in run_poll_handlers_once (ctx=ctx@entry=0x55555688bfc0, timeout=timeout@entry=0x7fffe65844a8) at /home/mdroth/w/qemu.git/util/aio-posix.c:520 #10 0x0000555555d02d1b in try_poll_mode (timeout=0x7fffe65844a8, ctx=0x55555688bfc0) at /home/mdroth/w/qemu.git/util/aio-posix.c:607 #11 0x0000555555d02d1b in aio_poll (ctx=ctx@entry=0x55555688bfc0, blocking=blocking@entry=true) at /home/mdroth/w/qemu.git/util/aio-posix.c:639 #12 0x0000555555d0004d in aio_wait_bh_oneshot (ctx=0x55555688bfc0, cb=cb@entry=0x5555558d5130 <virtio_blk_data_plane_stop_bh>, opaque=opaque@entry=0x555556de86f0) at /home/mdroth/w/qemu.git/util/aio-wait.c:71 #13 0x00005555558d59bf in virtio_blk_data_plane_stop (vdev=<optimized out>) at /home/mdroth/w/qemu.git/hw/block/dataplane/virtio-blk.c:288 #14 0x0000555555b906a1 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38) at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:245 #15 0x0000555555b90dbb in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38) at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:237 #16 0x0000555555b92a8e in virtio_pci_stop_ioeventfd (proxy=0x555556db4e40) at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:292 #17 0x0000555555b92a8e in virtio_write_config (pci_dev=0x555556db4e40, address=<optimized out>, val=1048832, len=<optimized out>) at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:613 I.e. the calling code is only scheduling a one-shot BH for virtio_blk_data_plane_stop_bh, but somehow we end up trying to process an additional virtqueue entry before we get there. This is likely due to the following check in virtio_queue_host_notifier_aio_poll: static bool virtio_queue_host_notifier_aio_poll(void opaque) { EventNotifier n = opaque; VirtQueue vq = container_of(n, VirtQueue, host_notifier); bool progress; if (!vq->vring.desc \|\| virtio_queue_empty(vq)) { return false; } progress = virtio_queue_notify_aio_vq(vq); namely the call to virtio_queue_empty(). In this case, since no new requests have actually been issued, shadow_avail_idx == last_avail_idx, so we actually try to access the vring via vring_avail_idx() to get the latest non-shadowed idx: int virtio_queue_empty(VirtQueue vq) { bool empty; ... if (vq->shadow_avail_idx != vq->last_avail_idx) { return 0; } rcu_read_lock(); empty = vring_avail_idx(vq) == vq->last_avail_idx; rcu_read_unlock(); return empty; but since the IOMMU region has been disabled we get a bogus value (0 usually), which causes virtio_queue_empty() to falsely report that there are entries to be processed, which causes errors such as: "virtio: zero sized buffers are not allowed" or "virtio-blk missing headers" and puts the device in an error state. This patch works around the issue by introducing virtio_set_disabled(), which sets a 'disabled' flag to bypass checks like virtio_queue_empty() when bus-mastering is disabled. Since we'd check this flag at all the same sites as vdev->broken, we replace those checks with an inline function which checks for either vdev->broken or vdev->disabled. The 'disabled' flag is only migrated when set, which should be fairly rare, but to maintain migration compatibility we disable it's use for older machine types. Users requiring the use of the flag in conjunction with older machine types can set it explicitly as a virtio-device option. NOTES: - This leaves some other oddities in play, like the fact that DRIVER_OK also gets unset in response to bus-mastering being disabled, but not restored (however the device seems to continue working) - Similarly, we disable the host notifier via virtio_bus_stop_ioeventfd(), which seems to move the handling out of virtio-blk dataplane and back into the main IO thread, and it ends up staying there till a reset (but otherwise continues working normally) Cc: David Gibson <david@gibson.dropbear.id.au>, Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Message-Id: <20191120005003.27035-1-mdroth@linux.vnet.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-01-05 07:03:03 -05:00
Igor Mammedov	fcd3f2cc12	numa: properly check if numa is supported Commit `aa57020774`, by mistake used MachineClass::numa_mem_supported to check if NUMA is supported by machine and also as unrelated change set it to true for sbsa-ref board. Luckily change didn't break machines that support NUMA, as the field is set to true for them. But the field is not intended for checking if NUMA is supported and will be flipped to false within this release for new machine types. Fix it: - by using previously used condition !mc->cpu_index_to_instance_props \|\| !mc->get_default_cpu_node_id the first time and then use MachineState::numa_state down the road to check if NUMA is supported - dropping stray sbsa-ref chunk Fixes: `aa57020774` Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1576154936-178362-3-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-12-19 14:57:14 -03:00
Paolo Bonzini	11bc4a13d1	kvm: convert "-machine kernel_irqchip" to an accelerator property Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:46 +01:00
Paolo Bonzini	23b0898e44	kvm: convert "-machine kvm_shadow_mem" to an accelerator property Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:27 +01:00
Paolo Bonzini	46472d8232	xen: convert "-machine igd-passthru" to an accelerator property The first machine property to fall is Xen's Intel integrated graphics passthrough. The "-machine igd-passthru" option does not set anymore a property on the machine object, but desugars to a GlobalProperty on accelerator objects. The setter is very simple, since the value ends up in a global variable, so this patch also provides an example before the more complicated cases that follow it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:27 +01:00
Paolo Bonzini	6f6e1698a6	vl: configure accelerators from -accel options Drop the "accel" property from MachineState, and instead desugar "-machine accel=" to a list of "-accel" options. This has a semantic change due to removing merge_lists from -accel. For example: - "-accel kvm -accel tcg" all but ignored "-accel kvm". This is a bugfix. - "-accel kvm -accel thread=single" ignored "thread=single", since it applied the option to KVM. Now it fails due to not specifying the accelerator on "-accel thread=single". - "-accel tcg -accel thread=single" chose single-threaded TCG, while now it will fail due to not specifying the accelerator on "-accel thread=single". Also, "-machine accel" and "-accel" become incompatible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:26 +01:00
Evgeny Yakovlev	5f2585772f	virtio-blk: advertise F_WCE (F_FLUSH) if F_CONFIG_WCE is advertised Virtio spec 1.1 (and earlier), 5.2.5.2 Driver Requirements: Device Initialization: "Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it if they offer VIRTIO_BLK_F_CONFIG_WCE" Currently F_CONFIG_WCE and F_WCE are not connected to each other. Qemu will advertise F_CONFIG_WCE if config-wce argument is set for virtio-blk device. And F_WCE is advertised only if underlying block backend actually has it's caching enabled. Fix this by advertising F_WCE if F_CONFIG_WCE is also advertised. To preserve backwards compatibility with newer machine types make this behaviour governed by "x-enable-wce-if-config-wce" virtio-blk-device property and introduce hw_compat_4_2 with new property being off by default for all machine types <= 4.2 (but don't introduce 4.3 machine type itself yet). Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru> Message-Id: <1572978137-189218-1-git-send-email-wrfsh@yandex-team.ru> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-12-13 11:22:06 +00:00
Peter Maydell	a8b5ad8e1f	virtio,vhost: fixes, features, cleanups. FLR support. Misc fixes, cleanups. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJdb6W/AAoJECgfDbjSjVRpRlEIAKvo9Sbq9bOtZ8nhbfJvLBWV nyOk5kgwv+XE+VhYGTsU7poYDPdRQn8uohBzXDb1zzCHd9corHriUXnUQ8TkDdz9 V9v8buK7qRPZa4OddPRVHDPZEn7OBbvNanhbo/Nw8iRcE/XdW+Ezw33A/aR8rSY7 KOxHYHeR2uBzVVDWKxp2yfBd+Zm9gbO27Y1thb9fyi4o7mHZ+gbrFl2p7z3wilNK KuGi0jCmS4I+4h2wmrZXnzSrozg9vJhXxkkdfI7QBze1XiVqC8w/bCcjXGVVGfhe SOvJH9A+yVyWpfjJpgmof4UISah+4zTi9G2SanZ4UERULD/NsiGfLQTVilUijAk= =K61t -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging virtio,vhost: fixes, features, cleanups. FLR support. Misc fixes, cleanups. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Wed 04 Sep 2019 12:53:35 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: libvhost-user: introduce and use vu_has_protocol_feature() libvhost-user: fix SLAVE_SEND_FD handling virtio-pci: Add Function Level Reset support virtio-rng: change default backend to rng-builtin virtio-rng: Keep the default backend out of VirtIORNGConf rng-builtin: add an RNG backend that uses qemu_guest_getrandom() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-09-04 17:22:34 +01:00
Julia Suvorova	eb1556c493	virtio-pci: Add Function Level Reset support Using FLR becomes convenient in cases where resetting the bus is impractical, for example, when debugging the behavior of individual functions. Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20190820163005.1880-1-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-09-04 06:33:10 -04:00
Tao Xu	aa57020774	numa: move numa global variable nb_numa_nodes into MachineState Add struct NumaState in MachineState and move existing numa global nb_numa_nodes(renamed as "num_nodes") into NumaState. And add variable numa_support into MachineClass to decide which submachines support NUMA. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190809065731.9097-3-tao3.xu@intel.com> [ehabkost: include hw/boards.h again to fix build failures] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-09-03 11:26:55 -03:00
Cornelia Huck	9aec2e52ce	hw: add compat machines for 4.2 Add 4.2 machine types for arm/i440fx/q35/s390x/spapr. For i440fx and q35, unversioned cpu models are still translated to -v1, as `0788a56bd1` ("i386: Make unversioned CPU models be aliases") states this should only transition to the latest cpu model version in 4.3 (or later). Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20190724103524.20916-1-cohuck@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 11:32:11 +10:00
Cornelia Huck	bb15791166	compat: disable edid on virtio-gpu base device 'edid' is a property of the virtio-gpu base device, so turning it off on virtio-gpu-pci is not enough (it misses -ccw). Turn it off on the base device instead. Fixes: `0a71966253` ("edid: flip the default to enabled") Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20190806115819.16026-1-cohuck@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-08-06 15:45:59 +01:00
Dr. David Alan Gilbert	c8557f1b48	pcie_root_port: Disable ACS on older machines ACS got added in 4.0 unconditionally, that broke older<->4.0 migration where there was a PCIe root port. Fix this by turning it off for 3.1 and older machines; note this fixes compatibility for older QEMUs but breaks compatibility with 4.0 for older machine types. machine type source qemu dest qemu 3.1 3.1 4.0 broken 3.1 3.1 4.1rc2 broken 3.1 3.1 4.1+this OK ++ 3.1 4.0 4.1rc2 OK 3.1 4.0 4.1+this broken -- 4.0 4.0 4.1rc2 OK 4.0 4.0 4.1+this OK So we gain and lose; the consensus seems to be treat this as a fix for older machine types. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190730093719.12958-3-dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-07-30 12:07:07 -04:00
Dr. David Alan Gilbert	dd56040d29	Revert "hw: report invalid disable-legacy\|modern usage for virtio-1-only devs" This reverts commit `f2784eed30` since that accidentally removes the PCIe capabilities from virtio devices because virtio_pci_dc_realize is called before the new 'mode' flag is set. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190729162903.4489-3-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>	2019-07-29 16:57:27 -04:00
Peter Maydell	032cfe6a79	pl031: Correctly migrate state when using -rtc clock=host The PL031 RTC tracks the difference between the guest RTC and the host RTC using a tick_offset field. For migration, however, we currently always migrate the offset between the guest and the vm_clock, even if the RTC clock is not the same as the vm_clock; this was an attempt to retain migration backwards compatibility. Unfortunately this results in the RTC behaving oddly across a VM state save and restore -- since the VM clock stands still across save-then-restore, regardless of how much real world time has elapsed, the guest RTC ends up out of sync with the host RTC in the restored VM. Fix this by migrating the raw tick_offset. To retain migration compatibility as far as possible, we have a new property migrate-tick-offset; by default this is 'true' and we will migrate the true tick offset in a new subsection; if the incoming data has no subsection we fall back to the old vm_clock-based offset information, so old->new migration compatibility is preserved. For complete new->old migration compatibility, the property is set to 'false' for 4.0 and earlier machine types (this will only affect 'virt-4.0' and below, as none of the other pl031-using machines are versioned). Reported-by: Russell King <rmk@armlinux.org.uk> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20190709143912.28905-1-peter.maydell@linaro.org	2019-07-15 14:17:04 +01:00
Stefan Hajnoczi	2bbadb08ce	virtio-balloon: fix QEMU 4.0 config size migration incompatibility The virtio-balloon config size changed in QEMU 4.0 even for existing machine types. Migration from QEMU 3.1 to 4.0 can fail in some circumstances with the following error: qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0 This happens because the virtio-balloon config size affects the VIRTIO Legacy I/O Memory PCI BAR size. Introduce a qdev property called "qemu-4-0-config-size" and enable it only for the QEMU 4.0 machine types. This way <4.0 machine types use the old size, 4.0 uses the larger size, and >4.0 machine types use the appropriate size depending on enabled virtio-balloon features. Live migration to and from old QEMUs to QEMU 4.1 works again as long as a versioned machine type is specified (do not use just "pc"!). Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190710141440.27635-1-stefanha@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-07-12 10:56:26 -04:00

1 2 3 4

154 Commits