Introduce the QOM realizefn suggested by Anthony.
Detailed documentation is supplied in the qdev header.
For now this implements a default DeviceClass::realize callback that
just wraps DeviceClass::init, which it deprecates.
Once all devices have been converted to DeviceClass::realize,
DeviceClass::init is to be removed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Whether the device was initialized or not is QOM-level information and
currently unused. Drop it from device. This leaves the boolean state of
whether or not DeviceClass::init was called or not, a.k.a. "realized".
Suggested-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch removes the default boot order for pseries machine. This allows
the machine to handle a NULL boot order in case no -boot option is provided.
Thus it helps SLOF firmware to verify if boot order is specified in command
line or not. If no boot order is provided SLOF tries to boot from the
device set in the nvram.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch makes default boot order machine specific instead of
set globally. The default boot order can be set per machine in
QEMUMachine boot_order. This also allows a machine to receive a
NULL boot order when -boot isn't used and take an appropriate action
accordingly. This helps machine boots from the devices as set in
guest's non-volatile memory location in case no boot order is
provided by the user.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
It leaks memory and fails to adjust qemu_acl member nentries. Future
acl_add become confused: can misreport the position, and can silently
fail to add.
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Careless use of malloc(): allocate Uint32[N], assign to int *, use
int[N].
Fix by converting to g_new().
Functions can't fail anymore, so make them return void. Caller
ignored the value anyway.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
ppc64 build needs this stub to build with virtio enabled.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* afaerber/memory-ioport:
acpi_piix4: Do not use old_portio-style callbacks
xen_platform: Do not use old_portio-style callbacks
hw/dma.c: Fix conversion of ioport_register* to MemoryRegion
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* stefanha/block:
block: Fix how mirror_run() frees its buffer
win32-aio: Fix how win32_aio_process_completion() frees buffer
scsi-disk: qemu_vfree(NULL) is fine, simplify
w32: Make qemu_vfree() accept NULL like the POSIX implementation
sheepdog: clean up sd_aio_setup()
sheepdog: multiplex the rw FD to flush cache
block: clear dirty bitmap when discarding
ide: issue discard asynchronously but serialize the pieces
ide: fix TRIM with empty range entry
block: make discard asynchronous
raw: support discard on block devices
raw-posix: remember whether discard failed
raw-posix: support discard on more filesystems
block: fix initialization in bdrv_io_limits_enable()
qcow2: Fix segfault on zero-length write
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* afaerber/qom-cpu:
target-i386: Use switch in check_hw_breakpoints()
target-i386: Avoid goto in hw_breakpoint_insert()
target-i386: Introduce hw_{local,global}_breakpoint_enabled()
target-i386: Define DR7 bit field constants
target-i386: Move kvm_check_features_against_host() check to realize time
target-i386: cpu_x86_register() consolidate freeing resources
target-i386: Move setting defaults out of cpu_x86_parse_featurestr()
target-i386: check/enforce: Check all feature words
target-i386/cpu.c: Add feature name array for ext4_features
target-i386: kvm_check_features_against_host(): Use feature_word_info
target-i386/cpu: Introduce FeatureWord typedefs
target-i386: Disable kvm_mmu by default
kvm: Add fake KVM constants to avoid #ifdefs on KVM-specific code
exec: Return CPUState from qemu_get_cpu()
xen: Simplify halting of first CPU
kvm: Pass CPUState to kvm_init_vcpu()
cpu: Move cpu_index field to CPUState
cpu: Move numa_node field to CPUState
target-mips: Clean up mips_cpu_map_tc() documentation
cpu: Move nr_{cores,threads} fields to CPUState
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
[AF: Used HWADDR_PRIx for hwaddr PIIX4_DPRINTF()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The commit 582299336879504353e60c7937fbc70fea93f3da introduced a 1-shift for
some offset in DMA emulation.
Before the previous commit, which converted ioport_register_* to
MemoryRegion, the DMA controller registered 8 ioports with the following
formula:
base + ((8 + i) << d->shift) where 0 <= i < 8
When an IO occured within a Memory Region, DMA callback receives an
offset relative to the start address. Here the start address is:
base + (8 << d->shift).
The offset should be: (i << d->shift). After the shift is reverted, the
offsets are 0..7 not 1..8.
Fixes LP#1089996.
Reported-by: Andreas Gustafsson <gson@gson.org>
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Static linkikng against ncurses may require explicit -ltinfo.
In case -lcurses and -lncurses both didn't work give pkg-config a
chance.
Fixes#1094786 for me.
Signed-off-by: Vadim Evard <v.e.evard@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It allocates with qemu_blockalign(), therefore it must free with
qemu_vfree(), not g_free().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
win32_aio_submit() allocates it with qemu_blockalign(), therefore it
must be freed with qemu_vfree(), not g_free().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On POSIX, qemu_vfree() accepts NULL, because it's merely wrapper
around free(). As far as I can tell, the Windows implementation
doesn't. Breeds bugs that bite only under Windows.
Make the Windows implementation behave like the POSIX implementation.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The last two parameters of sd_aio_setup() are never used, so remove them.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will reduce sockfds connected to the sheep server to one, which simply the
future hacks.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A usage with a hardcoded partial path such as
object_resolve_path_component(obj, "foo")
is totally valid but currently leads to a compilation error. Fix this.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Note that resetting bits in the dirty bitmap is done _before_ actually
processing the request. Writes, instead, set bits after the request
is completed.
This way, when there are concurrent write and discard requests, the
outcome will always be that the blocks are marked dirty. This scenario
should never happen, but it is safer to do it this way.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that discard can take a long time, make it asynchronous.
Each LBA range entry is processed separately because discard
can be an expensive operation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ATA-ACS-3 says "If the two byte range length is zero, then the LBA
Range Entry shall be discarded as padding." iovecs are used as if
they are linearized, so it is incorrect to discard the rest of
this iovec.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is easy with the thread pool, because we can use s->is_xfs and
s->has_discard from the worker function.
QEMU has a widespread assumption that each I/O operation writes less
than 2^32 bytes. This patch doesn't fix it throughout of course,
but it starts correcting struct RawPosixAIOData so that there is
no regression with respect to the synchronous discard implementation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block devices use a ioctl instead of fallocate, so add a separate
implementation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Avoid sending system calls repeatedly if they shall fail. This
does not apply to XFS: if the filesystem-specific ioctl fails,
something weird is happening.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Linux 2.6.38 introduced the filesystem independent interface to
deallocate part of a file. As of Linux 3.7, btrfs, ext4, ocfs2,
tmpfs and xfs support it.
Even though the system calls here are in practice issued on Linux,
the code is structured to allow plugging in alternatives for other Unix
variants. EOPNOTSUPP is used unconditionally in this patch, but it is
supported in both OpenBSD and Mac OS X since forever (see for example
http://lists.debian.org/debian-glibc/2006/02/msg00337.html).
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_io_limits_enable() starts a new slice, but does not set io_base
correctly for that slice.
Here is how io_base is used:
bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
if (bytes_base + bytes_res <= bytes_limit) {
/* no wait */
} else {
/* operation needs to be throttled */
}
As a result, any I/O operations that are triggered between now and
bs->slice_end are incorrectly limited. If 10 MB of data has been
written since the VM was started, QEMU thinks that 10 MB of data has
been written in this slice. This leads to a I/O lockup in the guest.
We fix this by delaying the start of a new slice to the next
call of bdrv_exceed_io_limits().
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Replace an if statement using magic numbers for breakpoint type with a
more explicit switch statement. This is to aid readability.
Change the return type and force_dr6_update argument type to bool.
While at it, fix Coding Style issues (missing braces).
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
"Go To Statement Considered Harmful" -- E. Dijkstra
To avoid an unnecessary goto within the switch statement, move
watchpoint insertion out of the switch statement. Improves readability.
While at it, fix Coding Style issues (missing braces, indentation).
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
hw_breakpoint_enabled() returned a bit field indicating whether a local
breakpoint and/or global breakpoint was enabled. Avoid this number magic
by using explicit boolean helper functions hw_local_breakpoint_enabled()
and hw_global_breakpoint_enabled(), to aid readability.
Reuse them for the hw_breakpoint_enabled() implementation and change
its return type to bool.
While at it, fix Coding Style issues (missing braces).
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Implicit use of dr7 bit field is a little hard to understand,
so define constants for them and use them consistently.
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
One of the recent refactoring patches (commit f50f88b9) didn't take care
to initialise l2meta properly, so with zero-length writes, which don't
even enter the write loop, qemu just segfaulted.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
kvm_check_features_against_host() should be called when features can't
be changed, and when features are converted to properties it would be
possible to change them until realize time, so correct way is to call
kvm_check_features_against_host() in x86_cpu_realize().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Freeing resources in one place would require setting 'error'
to not NULL, so add some more error reporting before jumping to
exit branch.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
No functional change, needed for simplifying conversion to properties.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This adds the following feature words to the list of flags to be checked
by kvm_check_features_against_host():
- cpuid_7_0_ebx_features
- ext4_features
- kvm_features
- svm_features
This will ensure the "enforce" flag works as it should: it won't allow
QEMU to be started unless every flag that was requested by the user or
defined in the CPU model is supported by the host.
This patch may cause existing configurations where "enforce" wasn't
preventing QEMU from being started to abort QEMU. But that's exactly the
point of this patch: if a flag was not supported by the host and QEMU
wasn't aborting, it was a bug in the "enforce" code.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Feature names were taken from the X86_FEATURE_* constants in the Linux
kernel code.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Instead of carrying the CPUID leaf/register and feature name array on
the model_features_t struct, move that information into
feature_word_info so it can be reused by other functions.
The goal is to eventually kill model_features_t entirely, but to do that
we have to either convert x86_def_t.features to an array or use
offsetof() inside FeatureWordInfo (to replace the pointers inside
model_features_t). So by now just move most of the model_features_t
fields to FeatureWordInfo except for the two pointers to local
arguments.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This introduces a FeatureWord enum, FeatureWordInfo struct (with
generation information about a feature word), and a FeatureWordArray
typedef, and changes add_flagname_to_bitmaps() code and
cpu_x86_parse_featurestr() to use the new typedefs instead of separate
variables for each feature word.
This will help us keep the code at kvm_check_features_against_host(),
cpu_x86_parse_featurestr() and add_flagname_to_bitmaps() sane while
adding new feature name arrays.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
KVM_CAP_PV_MMU capability reporting was removed from the kernel since
v2.6.33 (see commit a68a6a7282373), and was completely removed from the
kernel since v3.3 (see commit fb92045843). It doesn't make sense to keep
it enabled by default, as it would cause unnecessary hassle when using
the "enforce" flag.
This disables kvm_mmu on all machine-types. With this fix, the possible
scenarios when migrating from QEMU <= 1.3 to QEMU 1.4 are:
------------+----------+----------------------------------------------------
src kernel | dst kern.| Result
------------+----------+----------------------------------------------------
>= 2.6.33 | any | kvm_mmu was already disabled and will stay disabled
<= 2.6.32 | >= 3.3 | correct live migration is impossible
<= 2.6.32 | <= 3.2 | kvm_mmu will be disabled on next guest reboot *
------------+----------+----------------------------------------------------
* If they are running kernel <= 2.6.32 and want kvm_mmu to be kept
enabled on guest reboot, they can explicitly add +kvm_mmu to the QEMU
command-line. Using 2.6.33 and higher, it is not possible to enable
kvm_mmu explicitly anymore.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>