Commit Graph

55593 Commits

Author SHA1 Message Date
David Hildenbrand 12e1e8f1aa target/s390x: move cpu_mmu_idx_to_asc() to excp_helper.c
Only used in that file.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-11-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand c534055031 target/s390x: move cc_name() to helper.c
While at it, move the translations into the function and properly pass
enum cc_op as parameter. We can't move it to cc_helper.c as this would
break --disable-tcg.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand 1083a3f45c target/s390x: move gtod_*() declarations to s390-virtio.h
The functions are not used in target/s390x/ so a header in hw/s390x/
is a better place.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-9-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand e654ca84b6 s390x: drop inclusion of sysemu/kvm.h from some files
s390-stattrib.c needs definition of TARGET_PAGE_SIZE, solve it via cpu.h.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand 7d00bf94df s390x/cpumodel: factor out determination of default model name
Now we can drop inclusion of "sysemu/kvm.h" from "s390-virtio.c".

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-7-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand fe7cb8eab2 target/s390x: no need to pass kvm_state to savevm_gtod handlers
Let's avoid any KVM stuff in s390-virtio-ccw.c. This parameter is simply
ignored.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-6-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand c50f65118b target/s390x: simplify gs_allowed()
No need for kvm_enabled() as this function is only called from KVM and
there is no reason why it shouldn't be allowed for tcg. It is simply not
available under tcg.

Also, there is no need to check for the machine type anymore. Just like
ri_enabled(), we can directly use the stored flag, which results in
"true" for the "none" machine.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-5-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand ad4ad5f512 target/s390x: simplify ri_allowed()
Only used in KVM and there is no reason why it shouldn't be allowed for
tcg - it is simply not available.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-4-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand 708f99c366 s390x/kvm: drop KVMState parameter from kvm_s390_set_mem_limit()
Not needed at that point. Also drop it from kvm_s390_query_mem_limit()
we call in kvm_s390_set_mem_limit().

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-3-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
David Hildenbrand fba5f6feba s390x/kvm: drop KVMState parameter from s390_get_memslot_count()
Not needed at that point.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-2-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Thomas Huth 574ee06de9 s390x/s390-skeys: Mark the storage key devices with user_creatable = false
QEMU currently aborts if the user tries to create a skey device:

$ s390x-softmmu/qemu-system-s390x -nographic -device s390-skeys-qemu
qemu-system-s390x: hw/s390x/s390-skeys.c:30: s390_get_skeys_device:
 Assertion `ss' failed.
Aborted (core dumped)

The storage key devices are only meant to be instantiated one time,
internally. They can not be used by the user, so mark them with
user_creatable = false.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1503569328-22197-1-git-send-email-thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 1ecb285dc9 s390x: refine pci dependencies
VIRTIO_PCI should properly depend on CONFIG_PCI.
With this change, we can switch off pci for s390x by removing
'CONFIG_PCI=y' from the default config.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 42f865da96 s390x/pci: fence off instructions for non-pci
If a guest running on a machine without zpci issues a pci instruction,
throw them an exception.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 80b7a26536 s390x/sclp: properly guard pci-specific functions
If we do not provide zpci, pci reconfiguration via sclp is not available
either. I/O adapter configuration, however, should always be present.

Rename the values that refer to I/O adapter configuration (instead of only
pci) to make things clearer.

Move length checking of the sccb for I/O adapter configuration into the
common sclp code (out of the pci code). This also fixes an issue that
the pci code would refer to a field in the sccb before checking whether
it was actually long enough.

Check for the adapter type in the sccb and return unrecognized adapter
type if the guest tries to issue I/O adapter configure/deconfigure for
a type other than pci or for pci if the zpci facility is not provided.

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck d32bd032d8 s390x/ccw: create s390 phb conditionally
Don't create the s390 pci host bridge if we do not provide the zpci
facility.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 21eb052cf2 s390x/pci: do not advertise pci on non-pci builds
Only set the zpci feature bit on builds that actually support pci.

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 1c5deaec77 s390x: chsc nt2 events are pci-only
The nt2 event class is pci-only - don't look for events if pci is
not in the active cpu model.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 5838d65770 s390x/pci: add stubs
Some non-pci code calls into zpci code. Provide some stubs for builds
without pci.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 88c725c78e kvm: remove hard dependency on pci
The msi routing code in kvm calls some pci functions: provide
some stubs to enable builds without pci.

Also, to make this more obvious, guard them via a pci_available boolean
(which also can be reused in other places).

Fixes: e1d4fb2de ("kvm-irqchip: x86: add msi route notify fn")
Fixes: 767a554a0 ("kvm-all: Pass requester ID to MSI routing functions")
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 5f8c92e1d5 9pfs: fix dependencies
Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
on CONFIG_VIRTFS and CONFIG_VIRTIO/CONFIG_XEN only.

Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Christian Borntraeger e9a3591fa0 configure: enable --s390-pgste linker option
KVM guests on s390 need a different page table layout than normal
processes (2kb page table + 2kb page status extensions vs 2kb page table
only). As of today this has to be enabled via the vm.allocate_pgste
sysctl.

Newer kernels (>= 4.12) on s390 check for an S390_PGSTE program header
and enable the pgste page table extensions in that case. This makes the
vm.allocate_pgste sysctl unnecessary. We enable this program header for
the s390 system emulation (qemu-system-s390x) if we build on s390
- for s390 system emulation
- the linker supports --s390-pgste (binutils >= 2.29)
- KVM is enabled

This will allow distributions to disable the global vm.allocate_pgste
sysctl, which will improve the page table allocation for non KVM
processes as only 2kb chunks are necessary.

Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Dan Horak <dhorak@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1503483383-199649-1-git-send-email-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck eb569af835 s390x: wire up diag288 in tcg
Make the diag288 watchdog useable via tcg as well.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Thomas Huth 84ebd3e8c7 watchdog/wdt_diag288: Mark diag288 watchdog as non-hotpluggable
QEMU currently aborts when the user tries to hot-unplug a diag288
device:

$ qemu-system-s390x -nographic -nodefaults -S -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add diag288,id=x
(qemu) device_del x
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)

The device is not designed as hot-pluggable (it should only be used
via the "-watchdog" parameter), so let's simply remove the possibility
to hotplug it to prevent that users can run into this ugly situation.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1502892528-22618-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck a8aec856b8 s390x/tcg: specification exception for unknown diag
While the PoP is silent on the issue, z/VM documentation states
that unknown diagnose codes trigger a specification exception.
We already do that when running with kvm, so change tcg to do so
as well.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Thomas Huth ea5bef49ea tests: Add network filter tests to the check-qtest-s390x list
With some small modifications, we can also use the the netfilter,
the filter-mirror and the filter-redirector tests on s390x.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1502951113-4246-3-git-send-email-thuth@redhat.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Thomas Huth 7f57d58d6a tests: Run filter-redirector and -mirror test only on POSIX systems
This way we can get rid of the ugly #ifdefs in the code which makes
it easier to extend later.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1502951113-4246-2-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Thomas Huth b1b2feac94 tests/pxe: Check virtio-net-ccw on s390x
Now that we've got a firmware that can do TFTP booting on s390x (i.e.
the pc-bios/s390-netboot.img), we can enable the PXE tester for this
architecture, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1502431076-22849-3-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Thomas Huth 83898cce62 tests/boot-sector: Do not overwrite the x86 buffer on other architectures
Re-using the boot_sector code buffer from x86 for other architectures
is not very nice, especially if we add more architectures later. It's
also ugly that the test uses a huge pre-initialized array at all - the
size of the executable is very huge due to this array. So let's use a
separate buffer for each architecture instead, allocated from the heap,
so that we really just use the memory that we need.

Suggested-by: Michael Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1502431076-22849-2-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Thomas Huth 0d4fa4996f s390x/ipl: The s390-ipl device is not hot-pluggable
The s390-ipl device can not be created by the user, since it is meant only
to  be instantiated once internally to load the ROMs and kernel. If the user
tries to do a "device_add s390-ipl" via the monitor later, QEMU aborts with
a "ROM images must be loaded at startup" error message.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1502861458-30270-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Cornelia Huck 70d8d9a0c9 s390x: introduce 2.11 compat machine
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Dong Jia Shi 5c8d6f008c s390x/css: generate solicited crw for rchp completion signaling
A successful completion of rchp should signal a solicited channel path
initialized CRW (channel report word), while the current implementation
always generates an un-solicited one. Let's fix this.

Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170803003527.86979-3-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Dong Jia Shi 808e668bbc s390x/css: use macro for event-information pending error recover code
Let's use a macro for the ERC (error recover code) when generating a
Channel Subsystem Event-information pending CRW (channel report word).

While we are at it, let's also add all other ERCs.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170803003527.86979-2-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-08-30 18:23:25 +02:00
Peter Maydell 1ab5eb4efb Update version for v2.10.0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-30 17:02:54 +01:00
Fred Rolland fb9429ddc1 qemu-doc: Add UUID support in initiator name
Update doc with the usage of UUID for initiator name.

Related-To: https://bugzilla.redhat.com/1006468
Signed-off-by: Fred Rolland <frolland@redhat.com>
Message-id: 20170823084830.30500-1-frolland@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 13:47:53 +01:00
Stefan Hajnoczi 0ea47d0f36 tests: migration/guestperf Python 2.6 argparse compatibility
Add the scripts/ directory to sys.path so Python 2.6 will be able to
import argparse.

Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Message-id: 20170825155732.15665-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 12:02:11 +01:00
Stefan Hajnoczi c2d3189667 docker.py: Python 2.6 argparse compatibility
Add the scripts/ directory to sys.path so Python 2.6 will be able to
import argparse.

Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Message-id: 20170825155732.15665-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 12:02:11 +01:00
Stefan Hajnoczi 47e1cb1f0a scripts: add argparse module for Python 2.6 compatibility
The minimum Python version supported by QEMU is 2.6.  The argparse
standard library module was only added in Python 2.7.  Many scripts
would like to use argparse because it supports command-line
sub-commands.

This patch adds argparse.  See the top of argparse.py for details.

Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 20170825155732.15665-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 12:02:11 +01:00
Alberto Garcia c3a8fe331e misc: Remove unused Error variables
There's a few cases which we're passing an Error pointer to a function
only to discard it immediately afterwards without checking it. In
these cases we can simply remove the variable and pass NULL instead.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170829120836.16091-1-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 11:58:26 +01:00
Eduardo Habkost e916a6e88a oslib-posix: Print errors before aborting on qemu_alloc_stack()
If QEMU is running on a system that's out of memory and mmap()
fails, QEMU aborts with no error message at all, making it hard
to debug the reason for the failure.

Add perror() calls that will print error information before
aborting.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170829212053.6003-1-ehabkost@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 09:33:49 +01:00
Alberto Garcia d942feec97 throttle: Test the valid range of config values
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: a57dd6274e1b6dc9c28769fec4c7ea543be5c5e3.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:45 +01:00
Alberto Garcia 67335a4558 throttle: Make burst_length 64bit and add range checks
LeakyBucket.burst_length is defined as an unsigned integer but the
code never checks for overflows and it only makes sure that the value
is not 0.

In practice this means that the user can set something like
throttling.iops-total-max-length=4294967300 despite being larger than
UINT_MAX and the final value after casting to unsigned int will be 4.

This patch changes the data type to uint64_t. This does not increase
the storage size of LeakyBucket, and allows us to assign the value
directly from qemu_opt_get_number() or BlockIOThrottle and then do the
checks directly in throttle_is_valid().

The value of burst_length does not have a specific upper limit,
but since the bucket size is defined by max * burst_length we have
to prevent overflows. Instead of going for UINT64_MAX or something
similar this patch reuses THROTTLE_VALUE_MAX, which allows I/O bursts
of 1 GiB/s for 10 days in a row.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1b2e3049803f71cafb2e1fa1be4fb47147a0d398.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:45 +01:00
Alberto Garcia d00e6923b1 throttle: Make LeakyBucket.avg and LeakyBucket.max integer types
Both the throttling limits set with the throttling.iops-* and
throttling.bps-* options and their QMP equivalents defined in the
BlockIOThrottle struct are integer values.

Those limits are also reported in the BlockDeviceInfo struct and they
are integers there as well.

Therefore there's no reason to store them internally as double and do
the conversion everytime we're setting or querying them, so this patch
uses uint64_t for those types. Let's also use an unsigned type because
we don't allow negative values anyway.

LeakyBucket.level and LeakyBucket.burst_level do however remain double
because their value changes depending on the fraction of time elapsed
since the previous I/O operation.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: f29b840422767b5be2c41c2dfdbbbf6c5f8fedf8.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:45 +01:00
Alberto Garcia 2a8be39eba throttle: Remove throttle_fix_bucket() / throttle_unfix_bucket()
The throttling code can change internally the value of bkt->max if it
hasn't been set by the user. The problem with this is that if we want
to retrieve the original value we have to undo this change first. This
is ugly and unnecessary: this patch removes the throttle_fix_bucket()
and throttle_unfix_bucket() functions completely and moves the logic
to throttle_compute_wait().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 5b0b9e1ac6eb208d709eddc7b09e7669a523bff3.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:45 +01:00
Alberto Garcia fa36f1b2eb throttle: Make throttle_is_valid() a bit less verbose
Use a pointer to the bucket instead of repeating cfg->buckets[i] all
the time. This makes the code more concise and will help us expand the
checks later and save a few line breaks.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 763ffc40a26b17d54cf93f5a999e4656049fcf0c.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:45 +01:00
Alberto Garcia 0770a7a646 throttle: Update the throttle_fix_bucket() documentation
The way the throttling algorithm works is that requests start being
throttled once the bucket level exceeds the burst limit. When we get
there the bucket leaks at the level set by the user (bkt->avg), and
that leak rate is what prevents guest I/O from exceeding the desired
limit.

If we don't allow bursts (i.e. bkt->max == 0) then we can start
throttling requests immediately. The problem with keeping the
threshold at 0 is that it only allows one request at a time, and as
soon as there's a bit of I/O from the guest every other request will
be throttled and performance will suffer considerably. That can even
make the guest unable to reach the throttle limit if that limit is
high enough, and that happens regardless of the block scheduler used
by the guest.

Increasing that threshold gives flexibility to the guest, allowing it
to perform short bursts of I/O before being throttled. Increasing the
threshold too much does not make a difference in the long run (because
it's the leak rate what defines the actual throughput) but it does
allow the guest to perform longer initial bursts and exceed the
throttle limit for a short while.

A burst value of bkt->avg / 10 allows the guest to perform 100ms'
worth of I/O at the target rate without being throttled.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 31aae6645f0d1fbf3860fb2b528b757236f0c0a7.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:44 +01:00
Alberto Garcia faa8215c17 throttle: Fix wrong variable name in the header documentation
The level of the burst bucket is stored in bkt.burst_level, not
bkt.burst_length.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 49aab2711d02f285567f3b3b13a113847af33812.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:44 +01:00
Dan Aloni cdd346371e nvme: Fix get/set number of queues feature, again
The number of queues that should be return by the admin command should:

  1) Only mention the number of non-admin queues.
  2) It is zero-based, meaning that '0 == one non-admin queue',
     '1 == two non-admin queues', and so forth.

Because our `num_queues` means the number of queues _plus_ the admin
queue, then the right calculation for the number returned from the admin
command is `num_queues - 2`, combining the two requirements mentioned.

The issue was discovered by reducing num_queues from 64 to 8 and running
a Linux VM with an SMP parameter larger than that (e.g. 22). It tries to
utilize all queues, and therefore fails with an invalid queue number
when trying to queue I/Os on the last queue.

Signed-off-by: Dan Aloni <dan@kernelim.com>
CC: Alex Friedman <alex@e8storage.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 16:54:40 +01:00
Peter Maydell 248b237356 Update version for v2.10.0-rc4 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-24 17:34:26 +01:00
Peter Maydell 1eed33994e nbd patches for 2017-08-23
- Fam Zheng: 0/4 block: Fix non-shared storage migration
 - Stefan Hajnoczi: qemu-iotests: add 194 non-shared storage migration test
 - Stefan Hajnoczi: nbd-client: avoid spurious qio_channel_yield() re-entry
 -----BEGIN PGP SIGNATURE-----
 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg
 
 iQEcBAABCAAGBQJZnavdAAoJEKeha0olJ0NqtuEH/javGzZzPu7vsxwpifJkP8mD
 W4WMKmFL2a+RDA7zd31w/6yPXIAviHwa4ahsEvxysmn3TpxOSnLOR8qKAkkjJH6E
 duERzk+270rt/7lB5gtalyw2AROfps6NjbYZeo7SLg9vB9DRQX7PUG8ob0DbNWK4
 vJrrai57VODprGUGRaiOBxShP9nWTaauQbyhu+S13vhUASYb7XqtTJEfpcM5XwqG
 iH5jLPAkXQNv3Ltux4P0MPEuP8zkrgcUs3p6L7+s0h76IJsT8YOmTwJQG//HxxRo
 LcWb/hGyr1NkfWk4Im8f/Dh7dxlEdotGRdd/zxrEEXtBm/WmoM7jjAZe/Con7jQ=
 =sJy0
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-08-23' into staging

nbd patches for 2017-08-23

- Fam Zheng: 0/4 block: Fix non-shared storage migration
- Stefan Hajnoczi: qemu-iotests: add 194 non-shared storage migration test
- Stefan Hajnoczi: nbd-client: avoid spurious qio_channel_yield() re-entry

# gpg: Signature made Wed 23 Aug 2017 17:22:53 BST
# gpg:                using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2017-08-23:
  nbd-client: avoid spurious qio_channel_yield() re-entry
  qemu-iotests: add 194 non-shared storage migration test
  block: Update open_flags after ->inactivate() callback
  mirror: Mark target BB as "force allow inactivate"
  block-backend: Allow more "can inactivate" cases
  block-backend: Refactor inactivate check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-23 17:38:01 +01:00
Stefan Hajnoczi 40f4a21895 nbd-client: avoid spurious qio_channel_yield() re-entry
The following scenario leads to an assertion failure in
qio_channel_yield():

1. Request coroutine calls qio_channel_yield() successfully when sending
   would block on the socket.  It is now yielded.
2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because
   nbd_receive_reply() failed.
3. Request coroutine is entered and returns from qio_channel_yield().
   Note that the socket fd handler has not fired yet so
   ioc->write_coroutine is still set.
4. Request coroutine attempts to send the request body with nbd_rwv()
   but the socket would still block.  qio_channel_yield() is called
   again and assert(!ioc->write_coroutine) is hit.

The problem is that nbd_read_reply_entry() does not distinguish between
request coroutines that are waiting to receive a reply and those that
are not.

This patch adds a per-request bool receiving flag so
nbd_read_reply_entry() can avoid spurious aio_wake() calls.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170822125113.5025-1-stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23 11:22:15 -05:00