When trying to print data to the pty, we first check if it is connected.
If not, we try to reconnect, but we drop the pending data even if we
have successfully reconnected; this makes us lose the first byte of the very
first transmission.
This small fix addresses the issue by checking once more if the pty is connected
after having tried to reconnect.
Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Related spice-only bug. We have a fixed 16 MB buffer here, being
presented to the spice-server as qxl video memory in case spice is
used with a non-qxl card. It's also used with qxl in vga mode.
When using display resolutions requiring more than 16 MB of memory we
are going to overflow that buffer. In theory the guest can write,
indirectly via spice-server. The spice-server clears the memory after
setting a new video mode though, triggering a segfault in the overflow
case, so qemu crashes before the guest has a chance to do something
evil.
Fix that by switching to dynamic allocation for the buffer.
CVE-2014-3615
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
* remotes/kvaneesh/for-upstream:
hw/9pfs: Don't return type from host in readdir on local 9p filesystem
hw/9pfs: Use little-endian format for xattr values
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At present, this function doesn't have partial cleanup implemented,
which will cause resource leaks in some scenarios.
Example:
1. Assume that "dc->realize(dev, &local_err)" executes successful
and local_err == NULL;
2. device hotplug in hotplug_handler_plug() executes but fails
(it is prone to occur). Then local_err != NULL;
3. error_propagate(errp, local_err) and return. But the resources
which have been allocated in dc->realize() will be leaked.
Simple backtrace:
dc->realize()
|->device_realize
|->pci_qdev_init()
|->do_pci_register_device()
|->etc.
Add fuller cleanup logic which assures that function can
goto appropriate error label as local_err population is
detected at each relevant point.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Forcefully unrealize all children regardless of errors in earlier
iterations (if any). We should keep going with cleanup operation
rather than report an error immediately. Therefore store the first
child unrealization failure and propagate it at the end. We also
forcefully unregister vmsd and unrealize actual object, too.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUCJQsAAoJEJykq7OBq3PIFs4H/jHdJ65oXUeS8REtDwsRaU/q
Ftny6suH0j8XYh/zFSppNFHprX/i2AB7oJpHS8MzVjglxQ06OT/BQWSb2NA99URD
PARU0/Ijn2ZgReCiMS3qBGotYLJV/pJsZRtmi6xc/v9Zz/LlziBo1J/ZsZeMkhiP
RL/Q5ySixyWGx32989YcTmn98aCc4nvG70pE3dz3I3PPYQtUn38uqTltYPORaOgy
txhIOxeyvwgL+jwYvoJq5UgDpOw/QNtLRzN0+YydRUs5ad7roSlRX4PvlBgXxfWc
NPxt/wM+OPEyN029KLV8IjVNvxxM/QRNFqksabnmJIS/SgBaiSRPHZuHR5po8C4=
=cCXt
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging
Net patches
# gpg: Signature made Thu 04 Sep 2014 17:32:44 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
virtio-net: purge outstanding packets when starting vhost
net: complete all queued packets on VM stop
net: invoke callback when purging queue
virtio: don't call device on !vm_running
virtio-net: don't run bh on vm stopped
net: Forbid dealing with packets when VM is not running
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
whenever we start vhost, virtio could have outstanding packets
queued, when they complete later we'll modify the ring
while vhost is processing it.
To prevent this, purge outstanding packets on vhost start.
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This completes all packets, ensuring that callbacks
will not run when VM is stopped.
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
devices rely on packet callbacks eventually running,
but we violate this rule whenever we purge the queue.
To fix, invoke callbacks on all packets on purge.
Set length to 0, this way callers can detect that
this happened and re-queue if necessary.
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On vm stop, virtio changes vm_running state
too soon, so callbacks can get envoked with
vm_running = false;
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
commit 783e770693
virtio-net: stop/start bh when appropriate
is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.
Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When using mapped mode in 9pfs, readdir implementation
should not return file type in d_type from the host
readdir, instead, it should use the type stored in
the extended attributes. Since d_type is optional
and reading ext attrs for every readdir is expensive,
it should be sufficient to just set d_type to DT_UNKNOWN,
so guest will know to look it up separately.
This is a -stable material.
Signed-off-by: Bastian Blank <waldi@debian.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
This error can not happen normally. If it happens, it indicates
something very wrong, we should abort QEMU. Moreover, the
user can only refer to /machine/peripheral or /objects, not
/machine/unattached.
While at it, remove superfluous check about local_err.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Obsoleted by automatic object_property_add() arrayification.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
If "[*]" is given as the last part of a QOM property name, treat that
as an array property. The added property is given the first available
name, replacing the * with a decimal number counting from 0.
First add with name "foo[*]" will be "foo[0]". Second "foo[1]" and so
on.
Callers may inspect the ObjectProperty * return value to see what
number the added property was given.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Since commit c4090f8, -object options are no longer handled through
object_set_property(), so clean up -object leftovers by renaming the
function and dropping special-casing of qom-type and id properties.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Current object_child_foreach() uses QTAILQ_FOREACH() to walk
through children and that makes children removal from the callback
impossible.
This makes object_child_foreach() use QTAILQ_FOREACH_SAFE().
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
For all NICs(except virtio-net) emulated by qemu,
Such as e1000, rtl8139, pcnet and ne2k_pci,
Qemu can still receive packets when VM is not running.
If this happened in *migration's* last PAUSE VM stage, but
before the end of the migration, the new receiving packets will possibly dirty
parts of RAM which has been cached in *iovec*(will be sent asynchronously) and
dirty parts of new RAM which will be missed.
This will lead serious network fault in VM.
To avoid this, we forbid receiving packets in generic net code when
VM is not running.
Bug reproduction steps:
(1) Start a VM which configured at least one NIC
(2) In VM, open several Terminal and do *Ping IP -i 0.1*
(3) Migrate the VM repeatedly between two Hosts
And the *PING* command in VM will very likely fail with message:
'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you
run 'service network restart'
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A bunch of bugfixes - these will make sense for 2.1.1
Initial Intel IOMMU support.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUBxqDAAoJECgfDbjSjVRpAlUH+weaxN0pABkoplJ1OVFUH0wD
yBzIujvmSLTmur0i6uLjUJ+7g2+LkPdx+L4zYz8Z5hSaF9Xji6j2ZntMxpoCiDSz
A6jQup1vwjEEbuJWV9mUjsRN6D6+t1xQTT899tMAnVUDZtv/o81nDtjcFp4/7P5U
7SyiR/Lc3cbeTjKqROuyNItmohV9qo/Zts5Xa3zEJ0LaLoXwokwEBIg9C0Xioot8
dxhe3s8suMtipPiog2gpgDLXkqO5PrG9ggL02dNZaNsUdu+0ZVnFbBBwm+dF9Siw
LJRkT102lVABnnm54MLztD8ynAUQO9QzjQAGmnh2YC72AvEREijZ7/hfuImJaUc=
=7F5u
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features
A bunch of bugfixes - these will make sense for 2.1.1
Initial Intel IOMMU support.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 03 Sep 2014 14:41:23 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
acpi-build: Set FORCE_APIC_CLUSTER_MODEL bit for FADT flags
vhost-scsi: init backend features earlier
vhost_net: init acked_features to backend_features
vhost_net: start/stop guest notifiers properly
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit aad4dce934.
I accidentally merged the wrong version of a pull request
which had a buggy version of this patch. Reverting the
buggy version means we can then cleanly merge in the correct
pull with the corrected change.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers. Call that
unconditionally on every register write. That way we should catch
everything, even changing one register affecting the valid range of
another register.
Some of the holes have been added by commit
e9c6149f6a. Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.
Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.
Security impact:
(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source -> host memory leak. Memory isn't leaked to
the guest but to the vnc client though.
(2) Qemu will segfault in case the memory range happens to include
unmapped areas -> Guest can DoS itself.
The guest can not modify host memory, so I don't think this can be used
by the guest to escape.
CVE-2014-3615
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
VgaState->vram_size is the size of the pci bar. In case of qxl not the
whole pci bar can be used as vga framebuffer. Add a new variable
vbe_size to handle that case. By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.
This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
If we start Windows 2008 R2 DataCenter with number of cpu less than 8,
The system will use APIC Flat Logical destination mode as default configuration,
Which has an upper limit of 8 CPUs.
The fault is that VM can not show all processors within Task Manager if
we hot-add cpus when the number of cpus in VM extends the limit of 8.
If we use cluster destination model, the problem will be solved.
Note:
This flag was introduced later than ACPI v1.0 specification while QEMU
generates v1.0 tables only, but...
linux kernel ignores this flag, so patch has no influence on it.
Tested with Win[XPsp3|Srv2003EE|Srv2008DC|Srv2008R2|Srv2012R2], there
isn't BSODs and guests boot just fine. In cases guest doesn't support
cpu-hotplug, cpu becomes visible after reboot and in case the guest
supports cpu-hotplug, it works as expected with this patch.
Cc: qemu-stable@nongnu.org
Signed-off-by: huangzhichao <huangzhichao@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
As vhost core can use backend_features during init, clear it earlier to
avoid using uninitialized memory.
This use would be harmless since vhost scsi ignores the result
anyway, but initializing earlier will help prevent valgrind errors,
and make scsi and net behave similarly.
Cc: qemu-stable@nongnu.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
commit 2e6d46d77e (vhost: add
vhost_get_features and vhost_ack_features) removes the step that
initializes the acked_features to backend_features.
As this field is now uninitialized, vhost initialization will sometimes
fail.
To fix, initialize acked_features on each ack.
Tested-by: Andrey Korolyov <andrey@xdel.ru>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
commit a9f98bb5eb "vhost: multiqueue
support" changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.
To fix this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.
MST: fix up error handling.
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrey Korolyov <andrey@xdel.ru>
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <william@gandi.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
With security_model=mapped-xattr, we encode the uid,gid and other file
attributes as extended attributes of the file. We save them under
user.virtfs.* namespace.
Use little-endian encoding for on-disk values. This enables us to export
the same directory from both little-endian and big-endian hosts.
NOTE: This will break big-endian host that have virtFS exports
using security model mapped-xattr. They will have to use external tools
to convert the xattr to little-endian format.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
The hostfwd_add and hostfwd_remove monitor commands allow the user
to optionally specify a vlan/stack tuple. hostfwd_add honours this,
but hostfwd_remove does not (it looks up the tuple but then ignores
the SlirpState it has looked up and always uses the first stack
in the list anyway). Correct this to honour what the user requested.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
the memdev_list in hmp_info_memdev() is never freed.
so we use existent method qapi_free_MemdevList() to free it.
and also we can use qapi_free_MemdevList() to replace list loops
to clean up the memdev list in error path.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
string_output_get_string() uses g_string_free(str, false) to
transfer the 'str' pointer to callers and never free it.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is a dummy file with no user, drop it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Steps:
1.enable qemu debug print, using simply scprit as below:
grep "//#define DEBUG" * -rl | xargs sed -i "s/\/\/#define DEBUG/#define DEBUG/g"
2. make -j
3. get some warning:
hw/i2c/pm_smbus.c: In function 'smb_ioport_writeb':
hw/i2c/pm_smbus.c:142: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/i2c/pm_smbus.c:142: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/i2c/pm_smbus.c: In function 'smb_ioport_readb':
hw/i2c/pm_smbus.c:209: warning: format '%04x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/intc/i8259.c: In function 'pic_ioport_read':
hw/intc/i8259.c:373: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/input/pckbd.c: In function 'kbd_write_command':
hw/input/pckbd.c:232: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'
hw/input/pckbd.c: In function 'kbd_write_data':
hw/input/pckbd.c:333: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_writeb':
hw/isa/apm.c:44: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/isa/apm.c:44: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/isa/apm.c: In function 'apm_ioport_readb':
hw/isa/apm.c:67: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'hwaddr'
hw/timer/mc146818rtc.c: In function 'cmos_ioport_write':
hw/timer/mc146818rtc.c:394: warning: format '%02x' expects type 'unsigned int', but argument 3 has type 'uint64_t'
hw/i386/pc.c: In function 'port92_write':
hw/i386/pc.c:479: warning: format '%02x' expects type 'unsigned int', but argument 2 has type 'uint64_t'
Fix them.
Cc: qemu-trivial@nongnu.org
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
A bunch of bugfixes - these will make sense for 2.1.1
Initial Intel IOMMU support.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUBdygAAoJECgfDbjSjVRpa9cIAJS06we0CpJaVmPrQS5HvC1w
An5Y5bGdfMQtfKjqN1Kehmtu/+wjNKZJw427+6B+KNO7wm9rRUiu927qp9lNGlbH
g3ybrknKYeyqVO/43SJt8c1eODSkmNgHPqyCkRVLbriYo850b2HhjJyMvVNZqeHD
zuTmU95GTNeiYAV8J1c59OrqUz302kCXI4A47loY7LdoEFMbJat4DbkrkspuTgbQ
EVk5sR8p2atKzgaOV6M6yiAtL5uSBNr9KmHvuA7ZBiV21wmOJm5u3y6DpLczUD90
+Ln6BCjmPS5GQ12pzY7U65enr/x/RYo6k01ig9MP3TndNA02XxCaskqfd083jM8=
=4drK
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc fixes, features
A bunch of bugfixes - these will make sense for 2.1.1
Initial Intel IOMMU support.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 02 Sep 2014 16:05:04 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
vhost_net: start/stop guest notifiers properly
pci: avoid losing config updates to MSI/MSIX cap regs
virtio-net: don't run bh on vm stopped
ioh3420: remove unused ioh3420_init() declaration
vhost_net: cleanup start/stop condition
intel-iommu: add IOTLB using hash table
intel-iommu: add context-cache to cache context-entry
intel-iommu: add supports for queued invalidation interface
intel-iommu: fix coding style issues around in q35.c and machine.c
intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch
intel-iommu: add DMAR table to ACPI tables
intel-iommu: introduce Intel IOMMU (VT-d) emulation
iommu: add is_write as a parameter to the translate function of MemoryRegionIOMMUOps
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
commit a9f98bb5eb vhost: multiqueue
support changed the order of stopping the device. Previously
vhost_dev_stop would disable backend and only afterwards, unset guest
notifiers. We now unset guest notifiers while vhost is still
active. This can lose interrupts causing guest networking to fail. In
particular, this has been observed during migration.
To adapt this, several other changes are needed:
- remove the hdev->started assertion in vhost.c since we may want to
start the guest notifiers before vhost starts and stop the guest
notifiers after vhost is stopped.
- introduce the vhost_net_set_vq_index() and call it before setting
guest notifiers. This is used to guarantee vhost_net has the correct
virtqueue index when setting guest notifiers.
Cc: qemu-stable@nongnu.org
Reported-by: "Zhangjie (HZ)" <zhangjie14@huawei.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since
commit 95d6580024
msi: Invoke msi/msix_write_config from PCI core
msix config writes are lost, the value written is always 0.
Fix pci_default_write_config to avoid this.
Cc: qemu-stable@nongnu.org
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
commit 783e770693
virtio-net: stop/start bh when appropriate
is incomplete: BH might execute within the same main loop iteration but
after vmstop, so in theory, we might trigger an assertion.
I was unable to reproduce this in practice,
but it seems clear enough that the potential is there, so worth fixing.
Cc: qemu-stable@nongnu.org
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
commit 0f9b1771cc
ioh3420: Remove obsoleted, unused ioh3420_init function
removed the implementation of ioh3420_init
Drop the declaration from the header file as well.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Checking vhost device internal state in vhost_net looks like
a layering violation since vhost_net does not
set this flag: it is set and tested by vhost.c.
There seems to be no reason to check this:
caller in virtio net uses its own flag,
vhost_started, to ensure vhost is started/stopped
as appropriate.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
QEMU system mode page table walks are expensive. Taken by running QEMU
qemu-system-x86_64 system mode on Intel PIN , a TLB miss and walking a
4-level page tables in guest Linux OS takes ~450 X86 instructions on
average.
QEMU system mode TLB is implemented using a directly-mapped hashtable.
This structure suffers from conflict misses. Increasing the
associativity of the TLB may not be the solution to conflict misses as
all the ways may have to be walked in serial.
A victim TLB is a TLB used to hold translations evicted from the
primary TLB upon replacement. The victim TLB lies between the main TLB
and its refill path. Victim TLB is of greater associativity (fully
associative in this patch). It takes longer to lookup the victim TLB,
but its likely better than a full page table walk. The memory
translation path is changed as follows :
Before Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. TLB refill.
5. Do the memory access.
6. Return to code cache.
After Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. Victim TLB lookup.
5. If victim TLB misses, TLB refill
6. Do the memory access.
7. Return to code cache
The advantage is that victim TLB can offer more associativity to a
directly mapped TLB and thus potentially fewer page table walks while
still keeping the time taken to flush within reasonable limits.
However, placing a victim TLB before the refill path increase TLB
refill path as the victim TLB is consulted before the TLB refill. The
performance results demonstrate that the pros outweigh the cons.
some performance results taken on SPECINT2006 train
datasets and kernel boot and qemu configure script on an
Intel(R) Xeon(R) CPU E5620 @ 2.40GHz Linux machine are shown in the
Google Doc link below.
https://docs.google.com/spreadsheets/d/1eiItzekZwNQOal_h-5iJmC4tMDi051m9qidi5_nwvH4/edit?usp=sharing
In summary, victim TLB improves the performance of qemu-system-x86_64 by
11% on average on SPECINT2006, kernelboot and qemu configscript and with
highest improvement of in 26% in 456.hmmer. And victim TLB does not result
in any performance degradation in any of the measured benchmarks. Furthermore,
the implemented victim TLB is architecture independent and is expected to
benefit other architectures in QEMU as well.
Although there are measurement fluctuations, the performance
improvement is very significant and by no means in the range of
noises.
Signed-off-by: Xin Tong <trent.tong@gmail.com>
Message-id: 1407202523-23553-1-git-send-email-trent.tong@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>