These functions are more efficient in the presence of contention.
qemu_co_rwlock_downgrade also guarantees not to block, which may
be useful in some algorithms too.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-3-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion
as a parent.
This moves IOMMU-related fields from MR to IOMMU MR. However to avoid
dymanic QOM casting in fast path (address_space_translate, etc),
this adds an @is_iommu boolean flag to MR and provides new helper to
do simple cast to IOMMU MR - memory_region_get_iommu. The flag
is set in the instance init callback. This defines
memory_region_is_iommu as memory_region_get_iommu()!=NULL.
This switches MemoryRegion to IOMMUMemoryRegion in most places except
the ones where MemoryRegion may be an alias.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20170711035620.4232-2-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add warn_report(), warn_vreport() for reporting warnings, and
info_report(), info_vreport() for informational messages.
These are implemented them with a helper function factored out of
error_vreport(), suitably generalized. This patch makes no changes
to the output of the original error_report() function.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <c89e9980019f296ec9aa38d7689ac4d5c369296d.1499866456.git.alistair.francis@xilinx.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Add bdrv_dirty_bitmap_deserialize_ones() function, which is needed for
qcow2 bitmap loading, to handle unallocated bitmap parts, marked as
all-ones.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-7-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Make dirty iter resistant to resetting bits in corresponding HBitmap.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Now that qcow & qcow2 are wired up to get encryption keys
via the QCryptoSecret object, nothing is relying on the
interactive prompting for passwords. All the code related
to password prompting can thus be ripped out.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-17-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The user interface specifies job rate limits in bytes/second.
It's pointless to have our internal representation track things
in sectors/second, particularly since we want to move away from
sector-based interfaces.
Fix up a doc typo found while verifying that the ratelimit
code handles the scaling difference.
Repetition of expressions like 'n * BDRV_SECTOR_SIZE' will be
cleaned up later when functions are converted to iterate over
images by bytes rather than by sectors.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Not all platforms check whether a lock is initialized before used. In
particular Linux seems to be more permissive than OSX.
Check initialization state explicitly in our code to catch such bugs
earlier.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170704122325.25634-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cleanup: Create and use a typedef for PS2State and stop passing void
pointers. No functional change.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170606112105.13331-2-kraxel@redhat.com
Add helpers to gather cache info from the host at init-time.
For now, only export the host's I/D cache line sizes, which we
will use to improve cache locality to avoid false sharing.
Suggested-by: Richard Henderson <rth@twiddle.net>
Suggested-by: Geert Martin Ijewski <gm.ijewski@web.de>
Tested-by: Geert Martin Ijewski <gm.ijewski@web.de>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1496794624-4083-1-git-send-email-cota@braap.org>
[rth: Move all implementations from tcg/ppc/]
Signed-off-by: Richard Henderson <rth@twiddle.net>
This module provides fast paths for 64-bit atomic operations on machines
that only have 32-bit atomic access.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170605123908.18777-11-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
That typedefs are needed on both files. New compilers (F25 where I
work) don't complain about repeating a typedef. But older ones
complain.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested and confirmed that the stretch i386 debian qcow2 image on a
raspberry pi 2 works.
Fixes: LP#: 893208 <https://bugs.launchpad.net/qemu/+bug/893208/>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170418191817.10430-1-bobby.prani@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix regression from commit 4d43a603c7, where the serial and parallel
headers got removed from char.c, which broke the alias table.
Move the HAVE_CHARDEV_SERIAL/HAVE_CHARDEV_PARPORT to osdep.h instead
of being in separate headers.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We need to coordinate with the TCG_OVERSIZED_GUEST test in cputlb.c,
and allow 64-bit atomics even though sizeof(void *) == 4.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
So we remove all traces of them.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Not used anymore after moving block migration to use capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Highlights:
* New "-numa cpu" option
* NUMA distance configuration
* migration/i386 vmstatification
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJZFLh3AAoJECgHk2+YTcWmppQQAJe9Y5a3VwHXqvHbwBHX2ysn
RZDAUPd9DWpbM+UUydyKOVIZ7u5RXbbVq4E0NeCD8VYYd+grZB5Wo1cAzy3b4U2j
2s+MDqaPMtZtGoqxTsyQOVoVxazT5Kf1zglK+iUEzik44J7LGdro+ty2Z7Ut2c11
q9rE/GNS78czBm7c4lxgkxXW4N95K/tEGlLtDQ7uct//3U/ZimF+mO6GcbVFlOWT
4iEbOz2sqvBVv22nLJRufiPgFNIW4hizAz5KBWxwGFCCKvT3N6yYNKKjzEpCw+jE
lpjIRODU02yIZZZY841fLRtyrk7p4zORS8jRaHTdEJgb5bGc/YazxxVL8nzRQT1W
VxFwAMd+UNrDkV24hpN++Ln2O+b3kwcGZ7uA/qu9d5WvSYUKXlHqcMJ35q6zuhAI
/ecfYO7EZfVP86VjIt5IH04iV8RChA9Q6de+kQEFa6wHUxufeCOwCFqukGo8zj07
plX8NcjnzYmSXKnYjHOHao4rKT+DiJhRB60rFiMeKP/qvKbZPjtgsIeonhHm53qZ
/QwkhowahHKkpAnetIl0QHm8KS4YudAofMi/Fl+he4gRkEbSQVAo6iQb2L4cjcLC
LNSDDsIVWGem4gCR+vcsFqB3lggRDfltHXm15JKh92UMpOr6RI6s8pD55T7EdnPC
CfdxWB5kYM6/lLbOHj94
=48wH
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'ehabkost/tags/x86-and-machine-pull-request' into staging
x86 and machine queue, 2017-05-11
Highlights:
* New "-numa cpu" option
* NUMA distance configuration
* migration/i386 vmstatification
# gpg: Signature made Thu 11 May 2017 08:16:07 PM BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: Note: This key has expired!
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* ehabkost/tags/x86-and-machine-pull-request: (29 commits)
migration/i386: Remove support for pre-0.12 formats
vmstatification: i386 FPReg
migration/i386: Remove old non-softfloat 64bit FP support
tests: check -numa node,cpu=props_list usecase
numa: add '-numa cpu,...' option for property based node mapping
numa: remove node_cpu bitmaps as they are no longer used
numa: use possible_cpus for not mapped CPUs check
machine: call machine init from wrapper
numa: remove no longer need numa_post_machine_init()
tests: numa: add case for QMP command query-cpus
QMP: include CpuInstanceProperties into query_cpus output output
virt-arm: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
spapr: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
pc: get numa node mapping from possible_cpus instead of numa_get_node_for_cpu()
numa: do default mapping based on possible_cpus instead of node_cpu bitmaps
numa: mirror cpu to node mapping in MachineState::possible_cpus
numa: add check that board supports cpu_index to node mapping
virt-arm: add node-id property to CPU
pc: add node-id property to CPU
spapr: add node-id property to sPAPR core
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ctpopl() has a better implementation than hweight_long() and ui/vnc.c
being the last user of hweight_long(), we can simply remove it.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1489415605-13105-1-git-send-email-clg@kaod.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When there are more nodes than available memory to put the minimum
allowed memory by node, all the memory is put on the last node.
This is because we put (ram_size / nb_numa_nodes) &
~((1 << mc->numa_mem_align_shift) - 1); on each node, and in this
case the value is 0. This is particularly true with pseries,
as the memory must be aligned to 256MB.
To avoid this problem, this patch uses an error diffusion algorithm [1]
to distribute equally the memory on nodes.
We introduce numa_auto_assign_ram() function in MachineClass
to keep compatibility between machine type versions.
The legacy function is used with pseries-2.9, pc-q35-2.9 and
pc-i440fx-2.9 (and previous), the new one with all others.
Example:
qemu-system-ppc64 -S -nographic -nodefaults -monitor stdio -m 1G -smp 8 \
-numa node -numa node -numa node \
-numa node -numa node -numa node
Before:
(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 0 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 0 MB
node 4 cpus: 4
node 4 size: 0 MB
node 5 cpus: 5
node 5 size: 1024 MB
After:
(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 256 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 256 MB
node 4 cpus: 4
node 4 size: 256 MB
node 5 cpus: 5
node 5 size: 256 MB
[1] https://en.wikipedia.org/wiki/Error_diffusion
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170502162955.1610-2-lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: s/ram_size/size/ at numa_default_auto_assign_ram()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
They are wrappers of POSIX fcntl "file private locking", with a
convenient "try lock" wrapper implemented with F_OFD_GETLK.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
SocketAddressLegacy is a simple union, and simple unions are awkward:
they have their variant members wrapped in a "data" object on the
wire, and require additional indirections in C. SocketAddress is the
equivalent flat union. Convert all users of SocketAddressLegacy to
SocketAddress, except for existing external interfaces.
See also commit fce5d53..9445673 and 85a82e8..c5f1ae3.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Minor editing accident fixed, commit message and a comment tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The next commit will rename SocketAddressFlat to SocketAddress, and
the commit after that will replace most uses of SocketAddressLegacy by
SocketAddress, replacing most of this commit's renames right back.
Note that checkpatch emits a few "line over 80 characters" warnings.
The long lines are all temporary; the SocketAddressLegacy replacement
will shorten them again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
I'm going to flatten SocketAddress: rename SocketAddress to
SocketAddressLegacy, SocketAddressFlat to SocketAddress, eliminate
SocketAddressLegacy except in external interfaces.
inet_parse() returns a newly allocated InetSocketAddress. Lift the
allocation from inet_parse() into its caller socket_parse() to prepare
for flattening SocketAddress.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Straightforward rebase]
QEMU_BUILD_BUG_ON should use C11's _Static_assert, if the compiler supports it,
to provide more readable messages on failure.
We check for _Static_assert in configure, and set CONFIG_STATIC_ASSERT
accordingly. QEMU_BUILD_BUG_ON invokes _Static_assert if CONFIG_STATIC_ASSERT
is defined, and reverts to the old way otherwise.
That way, systems without C11 conforming compiler will still have the old
messages, as verified by intentionally breaking the configure check.
the following example output was generated by inverting the condition in
QEMU_BUILD_BUG_ON:
without _Static_assert:
> In file included from /qemu/include/qemu/osdep.h:36:0,
> from /qemu/qga/commands.c:13:
> /qemu/qga/commands.c: In function ‘qmp_guest_exec_status’:
> /qemu/include/qemu/compiler.h:89:12: error: negative width in bit-field ‘<anonymous>’
> struct { \
> ^
> /qemu/include/qemu/compiler.h:96:38: note: in expansion of macro QEMU_BUILD_BUG_ON_STRUCT’
> #define QEMU_BUILD_BUG_ON(x) typedef QEMU_BUILD_BUG_ON_STRUCT(x) \
> ^~~~~~~~~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:146:5: note: in expansion of macro ‘QEMU_BUILD_BUG_ON’
> QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *)); \
> ^~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:417:5: note: in expansion of macro ‘atomic_load_acquire’
> atomic_load_acquire(ptr)
> ^~~~~~~~~~~~~~~~~~~
> /qemu/qga/commands.c:160:21: note: in expansion of macro ‘atomic_mb_read’
> bool finished = atomic_mb_read(&gei->finished);
> ^~~~~~~~~~~~~~
with _Static_assert:
> In file included from /qemu/include/qemu/osdep.h:36:0,
> from /qemu/qga/commands.c:13:
> /qemu/qga/commands.c: In function ‘qmp_guest_exec_status’:
> /qemu/include/qemu/compiler.h:94:30: error: static assertion failed: "not expecting: sizeof(*&gei->finished) > sizeof(void *)"
> #define QEMU_BUILD_BUG_ON(x) _Static_assert((x), #x)
> ^
> /qemu/include/qemu/atomic.h:146:5: note: in expansion of macro ‘QEMU_BUILD_BUG_ON’
> QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *)); \
> ^~~~~~~~~~~~~~~~~
> /qemu/include/qemu/atomic.h:417:5: note: in expansion of macro ‘atomic_load_acquire’
> atomic_load_acquire(ptr)
> ^~~~~~~~~~~~~~~~~~~
> /qemu/qga/commands.c:160:21: note: in expansion of macro ‘atomic_mb_read’
> bool finished = atomic_mb_read(&gei->finished);
> ^~~~~~~~~~~~~~
Signed-off-by: Andreas Grapentin <andreas@grapentin.org>
Message-Id: <20170314165953.18506-1-andreas@grapentin.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds support for getting and using a local copy of the dirty
bitmap.
memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy. The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.
memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_copy_and_clear_dirty().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
We already require gcc 4.1 or newer (for the atomic
support), so the fallback codepaths for older gcc
versions than that are now dead code and we can
just delete them.
NB: clang reports itself as gcc 4.2 (regardless of
clang version), so clang won't be using the fallbacks
either.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
It's a variant of qemu_coroutine_enter with an explicit AioContext
parameter.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
By holding off updates to timer_state.qemu_icount we can run into
trouble when the non-vCPU thread needs to know the time. This helper
ensures we atomically update timers_state.qemu_icount based on what
has been currently executed.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
SocketAddress is a simple union, and simple unions are awkward: they
have their variant members wrapped in a "data" object on the wire, and
require additional indirections in C. I intend to limit its use to
existing external interfaces. New ones should use SocketAddressFlat.
I further intend to convert all internal interfaces to
SocketAddressFlat. This helper should go away then.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-8-git-send-email-armbru@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The multithreaded TCG implementation exposed deadlocks in the win32
condition variables: as implemented, qemu_cond_broadcast waited on
receivers, whereas the pthreads API it was intended to emulate does
not. This was causing a deadlock because broadcast was called while
holding the IO lock, as well as all possible waiters blocked on the
same lock.
This patch replaces all the custom synchronisation code for mutexes
and condition variables with native Windows primitives (SRWlocks and
condition variables) with the same semantics as their POSIX
equivalents. To enable that, it requires a Windows Vista or newer host
OS.
Signed-off-by: Andrey Shedel <ashedel@microsoft.com>
[AB: edited commit message]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-Id: <20170324220141.10104-1-Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu-ga's socket activation support was not obeying the LISTEN_PID
environment variable, which avoids that a process uses a socket-activation
file descriptor meant for its parent.
Mess can for example ensue if a process forks a children before consuming
the socket-activation file descriptor and therefore setting O_CLOEXEC
on it.
Luckily, qemu-nbd also got socket activation code, and its copy does
support LISTEN_PID. Some extra fixups are needed to ensure that the
code can be used for both, but that's what this patch does. The
main change is to replace get_listen_fds's "consume" argument with
the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code.
Cc: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
icount has become much slower after tcg_cpu_exec has stopped
using the BQL. There is also a latent bug that is masked by
the slowness.
The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
timer now has to wake up the I/O thread and wait for it. The rendez-vous
is mediated by the BQL QemuMutex:
- handle_icount_deadline wakes up the I/O thread with BQL taken
- the I/O thread wakes up and waits on the BQL
- the VCPU thread releases the BQL a little later
- the I/O thread raises an interrupt, which calls qemu_cpu_kick
- the VCPU thread notices the interrupt, takes the BQL to
process it and waits on it
All this back and forth is extremely expensive, causing a 6 to 8-fold
slowdown when icount is turned on.
One may think that the issue is that the VCPU thread is too dependent
on the BQL, but then the latent bug comes in. I first tried removing
the BQL completely from the x86 cpu_exec, only to see everything break.
The only way to fix it (and make everything slow again) was to add a dummy
BQL lock/unlock pair.
This is because in -icount mode you really have to process the events
before the CPU restarts executing the next instruction. Therefore, this
series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
the vCPU thread when running in icount mode.
The required changes include:
- make the timer notification callback wake up TCG's single vCPU thread
when run from another thread. By using async_run_on_cpu, the callback
can override all_cpu_threads_idle() when the CPU is halted.
- move handle_icount_deadline after qemu_tcg_wait_io_event, so that
the timer notification callback is invoked after the dummy work item
wakes up the vCPU thread
- make handle_icount_deadline run the timers instead of just waking the
I/O thread.
- stop processing the timers in the main loop
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no change for now, because the callback just invokes
qemu_notify_event.
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This dependency is the wrong way, and we will need util/qemu-timer.h from
sysemu/cpus.h in the next patch.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using "-mem-prealloc" option for a large guest leads to higher guest
start-up and migration time. This is because with "-mem-prealloc" option
qemu tries to map every guest page (create address translations), and
make sure the pages are available during runtime. virsh/libvirt by
default, seems to use "-mem-prealloc" option in case the guest is
configured to use huge pages. The patch tries to map all guest pages
simultaneously by spawning multiple threads. Currently limiting the
change to QEMU library functions on POSIX compliant host only, as we are
not sure if the problem exists on win32. Below are some stats with
"-mem-prealloc" option for guest configured to use huge pages.
------------------------------------------------------------------------
Idle Guest | Start-up time | Migration time
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - single threaded (existing code)
------------------------------------------------------------------------
64 Core - 4TB | 54m11.796s | 75m43.843s
64 Core - 1TB | 8m56.576s | 14m29.049s
64 Core - 256GB | 2m11.245s | 3m26.598s
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 8 threads
------------------------------------------------------------------------
64 Core - 4TB | 5m1.027s | 34m10.565s
64 Core - 1TB | 1m10.366s | 8m28.188s
64 Core - 256GB | 0m19.040s | 2m10.148s
-----------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 16 threads
-----------------------------------------------------------------------
64 Core - 4TB | 1m58.970s | 31m43.400s
64 Core - 1TB | 0m39.885s | 7m55.289s
64 Core - 256GB | 0m11.960s | 2m0.135s
-----------------------------------------------------------------------
Changed in v2:
- modify number of memset threads spawned to min(smp_cpus, 16).
- removed 64GB memory restriction for spawning memset threads.
Changed in v3:
- limit number of threads spawned based on
min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus)
- implement memset thread specific siglongjmp in SIGBUS signal_handler.
Changed in v4
- remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread
as main thread no longer touches any pages.
- simplify code my returning memset_thread_failed status from
touch_all_pages.
Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com>
Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
keyval_parse() parses KEY=VALUE,... into a QDict. Works like
qemu_opts_parse(), except:
* Returns a QDict instead of a QemuOpts (d'oh).
* Supports nesting, unlike QemuOpts: a KEY is split into key
fragments at '.' (dotted key convention; the block layer does
something similar on top of QemuOpts). The key fragments are QDict
keys, and the last one's value is updated to VALUE.
* Each key fragment may be up to 127 bytes long. qemu_opts_parse()
limits the entire key to 127 bytes.
* Overlong key fragments are rejected. qemu_opts_parse() silently
truncates them.
* Empty key fragments are rejected. qemu_opts_parse() happily
accepts empty keys.
* It does not store the returned value. qemu_opts_parse() stores it
in the QemuOptsList.
* It does not treat parameter "id" specially. qemu_opts_parse()
ignores all but the first "id", and fails when its value isn't
id_wellformed(), or duplicate (a QemuOpts with the same ID is
already stored). It also screws up when a value contains ",id=".
* Implied value is not supported. qemu_opts_parse() desugars "foo" to
"foo=on", and "nofoo" to "foo=off".
* An implied key's value can't be empty, and can't contain ','.
I intend to grow this into a saner replacement for QemuOpts. It'll
take time, though.
Note: keyval_parse() provides no way to do lists, and its key syntax
is incompatible with the __RFQDN_ prefix convention for downstream
extensions, because it blindly splits at '.', even in __RFQDN_. Both
issues will be addressed later in the series.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>
The way we get QMP commands registered is high tech:
* qapi-commands.py generates qmp_init_marshal() that does the actual work
* it also generates the magic to register it as a MODULE_INIT_QAPI
function, so it runs when someone calls
module_call_init(MODULE_INIT_QAPI)
* main() calls module_call_init()
QEMU needs to register a few non-qapified commands. Same high tech
works: monitor.c has its own qmp_init_marshal() along with the magic
to make it run in module_call_init(MODULE_INIT_QAPI).
QEMU also needs to unregister commands that are not wanted in this
build's configuration (commit 5032a16). Simple enough:
qmp_unregister_commands_hack(). The difficulty is to make it run
after the generated qmp_init_marshal(). We can't simply run it in
monitor.c's qmp_init_marshal(), because the order in which the
registered functions run is indeterminate. So qmp_init_marshal()
registers qmp_unregister_commands_hack() separately. Since
registering *appends* to the list of registered functions, this will
make it run after all the functions that have been registered already.
I suspect it takes a long and expensive computer science education to
not find this silly.
Dumb it down as follows:
* Drop MODULE_INIT_QAPI entirely
* Give the generated qmp_init_marshal() external linkage.
* Call it instead of module_call_init(MODULE_INIT_QAPI)
* Except in QEMU proper, call new monitor_init_qmp_commands() that in
turn calls the generated qmp_init_marshal(), registers the
additional commands and unregisters the unwanted ones.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-5-git-send-email-armbru@redhat.com>
This will probably be my last pull request before the hard freeze. It
has some new work, but that has all been posted in draft before the
soft freeze, so I think it's reasonable to include in qemu-2.9.
This batch has:
* A substantial amount of POWER9 work
* Implements the legacy (hash) MMU for POWER9
* Some more preliminaries for implementing the POWER9 radix
MMU
* POWER9 has_work
* Basic POWER9 compatibility mode handling
* Removal of some premature tests
* Some cleanups and fixes to the existing MMU code to make the
POWER9 work simpler
* A bugfix for TCG multiply adds on power
* Allow pseries guests to access PCIe extended config space
This also includes a code-motion not strictly in ppc code - moving
getrampagesize() from ppc code to exec.c. This will make some future
VFIO improvements easier, Paolo said it was ok to merge via my tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYuOEEAAoJEGw4ysog2bOSHD0P/jBg/qr/4KnsB1KhnlVrB2sP
vy2d3bGGlUWr9Z+CK/PMCRB8ekFgQLjidLIXji6mviUocv6m3WsVrnbLF/oOL/IT
NPMVAffw7q804YVu1Ns9R82d6CIqHTy//bpg69tFMcJmhL9fqPan3wTZZ9JeiyAm
SikqkAHBSW4SxKqg8ApaSqx5L2QTqyfkClR0sLmgM0JtmfJrbobpQ6bMtdPjUZ9L
n2gnpO2vaWCa1SEQrRrdELqvcD8PHkSJapWOBXOkpGWxoeov/PYxOgkpdDUW4qYY
lVLtp1Vd3OB/h3Unqfw32DNiHA5p89hWPX5UybKMgRVL9Cv2/lyY47pcY8XTeNzn
bv84YRbFJeI+GgoEnghmtq+IM8XiW/cr9rWm9wATKfKGcmmFauumALrsffUpHVCM
4hSNgBv5t2V9ptZ+MDlM/Ku+zk9GoqwQ+hemdpVtiyhOtGUPGFBn5YLE4c2DHFxV
+L9JtBnFn8obnssNoz0wL+QvZchT1qUHMhH5CWAanjw9CTDp/YwQ2P01zK+00s9d
4cB7fUG3WNto5eXXEGMaXeDsUEz8z//hTe3j5sVbnHsXi0R3dhv7iryifmx4bUKU
H9EwAc+uNUHbvBy7u6IWg0I8P2n00CCO6JqXijQ92zELJ5j0XhzHUI2dOXn+zyEo
3FZu56LFnSSUBEXuTjq4
=PcNw
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' into staging
ppc patch queuye for 2017-03-03
This will probably be my last pull request before the hard freeze. It
has some new work, but that has all been posted in draft before the
soft freeze, so I think it's reasonable to include in qemu-2.9.
This batch has:
* A substantial amount of POWER9 work
* Implements the legacy (hash) MMU for POWER9
* Some more preliminaries for implementing the POWER9 radix
MMU
* POWER9 has_work
* Basic POWER9 compatibility mode handling
* Removal of some premature tests
* Some cleanups and fixes to the existing MMU code to make the
POWER9 work simpler
* A bugfix for TCG multiply adds on power
* Allow pseries guests to access PCIe extended config space
This also includes a code-motion not strictly in ppc code - moving
getrampagesize() from ppc code to exec.c. This will make some future
VFIO improvements easier, Paolo said it was ok to merge via my tree.
# gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.9-20170303:
target/ppc: rewrite f[n]m[add,sub] using float64_muladd
spapr: Small cleanup of PPC MMU enums
spapr_pci: Advertise access to PCIe extended config space
target/ppc: Rework hash mmu page fault code and add defines for clarity
target/ppc: Move no-execute and guarded page checking into new function
target/ppc: Add execute permission checking to access authority check
target/ppc: Add Instruction Authority Mask Register Check
hw/ppc/spapr: Add POWER9 to pseries cpu models
target/ppc/POWER9: Add cpu_has_work function for POWER9
target/ppc/POWER9: Add POWER9 pa-features definition
target/ppc/POWER9: Add POWER9 mmu fault handler
target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
target/ppc: Add patb_entry to sPAPRMachineState
target/ppc/POWER9: Add POWERPC_MMU_V3 bit
powernv: Don't test POWER9 CPU yet
exec, kvm, target-ppc: Move getrampagesize() to common code
target/ppc: Add POWER9/ISAv3.00 to compat_table
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move the KVM "eat signals" code under CONFIG_LINUX, in preparation
for moving it to kvm-all.c; reraise non-MCE SIGBUS immediately,
without passing it to KVM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The cast is there because sigbus_handler is invoked via sigfd_handler.
But it feels just wrong to use struct qemu_signalfd_siginfo in the
prototype of a function that is passed to sigaction.
Instead, do a simple-minded conversion of qemu_signalfd_siginfo to
siginfo_t.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.
However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.
This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Similarly to allocation, do it from an inline function. This allows
tests to only use the headers for allocation/free of timer.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This patch removes the redundant throttle code that was present in
block and fsdev device files. Now the common code is moved
to a single file.
Signed-off-by: Pradeep Jagadeesh <pradeep.jagadeesh@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
(fix indent nit, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
This will permit its use in parse_option_size().
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:Block layer core)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1487708048-2131-24-git-send-email-armbru@redhat.com>
This makes qemu_strtosz(), qemu_strtosz_mebi() and
qemu_strtosz_metric() similar to qemu_strtoi64(), except negative
values are rejected.
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:Block layer core)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1487708048-2131-23-git-send-email-armbru@redhat.com>
Most callers of qemu_strtosz_suffix() pass QEMU_STRTOSZ_DEFSUFFIX_B.
Capture the pattern in new qemu_strtosz().
Inline qemu_strtosz_suffix() into its only remaining caller.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1487708048-2131-17-git-send-email-armbru@redhat.com>
With qemu_strtosz(), no suffix means mebibytes. It's used rarely.
I'm going to add a similar function where no suffix means bytes.
Rename qemu_strtosz() to qemu_strtosz_MiB() to make the name
qemu_strtosz() available for the new function.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1487708048-2131-16-git-send-email-armbru@redhat.com>
To parse numbers with metric suffixes, we use
qemu_strtosz_suffix_unit(nptr, &eptr, QEMU_STRTOSZ_DEFSUFFIX_B, 1000)
Capture this in a new function for legibility:
qemu_strtosz_metric(nptr, &eptr)
Replace test_qemu_strtosz_suffix_unit() by test_qemu_strtosz_metric().
Rename qemu_strtosz_suffix_unit() to do_strtosz() and give it internal
linkage.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1487708048-2131-15-git-send-email-armbru@redhat.com>
The name qemu_strtoll() suggests conversion to long long, but it
actually converts to int64_t. Rename to qemu_strtoi64().
The name qemu_strtoull() suggests conversion to unsigned long long,
but it actually converts to uint64_t. Rename to qemu_strtou64().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1487708048-2131-7-git-send-email-armbru@redhat.com>
This adds a CoMutex around the existing CoQueue. Because the write-side
can just take CoMutex, the old "writer" field is not necessary anymore.
Instead of removing it altogether, count the number of pending writers
during a read-side critical section and forbid further readers from
entering.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-7-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
All that CoQueue needs in order to become thread-safe is help
from an external mutex. Add this to the API.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will avoid forward references in the next patch. It is also
more logical because CoQueue is not anymore the basic primitive.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-5-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Running a very small critical section on pthread_mutex_t and CoMutex
shows that pthread_mutex_t is much faster because it doesn't actually
go to sleep. What happens is that the critical section is shorter
than the latency of entering the kernel and thus FUTEX_WAIT always
fails. With CoMutex there is no such latency but you still want to
avoid wait and wakeup. So introduce it artificially.
This only works with one waiters; because CoMutex is fair, it will
always have more waits and wakeups than a pthread_mutex_t.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This uses the lock-free mutex described in the paper '"Blocking without
Locking", or LFTHREADS: A lock-free thread library' by Gidenstam and
Papatriantafilou. The same technique is used in OSv, and in fact
the code is essentially a conversion to C of OSv's code.
[Added missing coroutine_fn in tests/test-aio-multithread.c.
--Stefan]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
aio_co_wake provides the infrastructure to start a coroutine on a "home"
AioContext. It will be used by CoMutex and CoQueue, so that coroutines
don't jump from one context to another when they go to sleep on a
mutex or waitqueue. However, it can also be used as a more efficient
alternative to one-shot bottom halves, and saves the effort of tracking
which AioContext a coroutine is running on.
aio_co_schedule is the part of aio_co_wake that starts a coroutine
on a remove AioContext, but it is also useful to implement e.g.
bdrv_set_aio_context callbacks.
The implementation of aio_co_schedule is based on a lock-free
multiple-producer, single-consumer queue. The multiple producers use
cmpxchg to add to a LIFO stack. The consumer (a per-AioContext bottom
half) grabs all items added so far, inverts the list to make it FIFO,
and goes through it one item at a time until it's empty. The data
structure was inspired by OSv, which uses it in the very code we'll
"port" to QEMU for the thread-safe CoMutex.
Most of the new code is really tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
To iterate over all QemuOpts currently requires using a callback
function which is inconvenient for control flow. Add support for
using iterator functions more directly
QemuOptsIter iter;
QemuOpt *opt;
qemu_opts_iter_init(&iter, opts, "repeated-key");
while ((opt = qemu_opts_iter_next(&iter)) != NULL) {
....do something...
}
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170203120649.15637-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This obsoletes ppc-for-2.9-20170112, which had a MacOS build bug.
This is a long overdue ppc pull request for qemu-2.9. It's been a
long time coming due to some holidays and inconveniently timed
problems with testing. So, there's a lot in here:
* More POWER9 instruction implementations for TCG
* The simpler parts of my CPU compatibility mode cleanup
* This changes behaviour to prefer compatibility modes over
"raW" mode for new machine type versions
* New "40p" machine type which is essentially a modernized and
cleaned up "prep". The intention is that it will replace "prep"
once it has some more testing and polish.
* Add pseries-2.9 machine type
* Implement H_SIGNAL_SYS_RESET hypercall
* Consolidate the two alternate CPU init paths in pseries by
making it always go through CPU core objects to initialize CPU
* A number of bugfixes and cleanups
* Stop the guest timebase when the guest is stopped under KVM.
This makes the guest system clock also stop when paused, which
matches the x86 behaviour.
* Some preliminary cleanups leading towards implementation of the
POWER9 MMU.
There are also some changes not strictly related to ppc code, but for
its benefit:
* Limit the pxi-expander-bridge (PXB) device to x86 guests only
(it's essentially a hack to work around historical x86
limitations)
* Some additions to the 128-bit math in host_utils, necessary for
some of the new instructions.
* Revise a number of qtests and enable them for ppc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYko4AAAoJEGw4ysog2bOStEYQAIk0Pd6ifZzJUcTWQaR8+AZ7
nTbzQyWtSHqSAiwBNsykJMFXV1liZVglf2e+VBsrVOwKoU50VOyVm5LspG2z1h8N
Rxe4FGA2MA//2F3+9/AP8Oe3RdsClNCDaXAVuCFRP4xQWxqqwwasChDeS4Ph/cZq
CXnlhKTpk9v5vSCsr64bUOSYh3RPumnQepiBgT82hOo7R+VaJ79AFbTeCYKkd0hY
Sq8g3mg0zOX1ekNXPk1h8oZWqkoZGbqKiXgoy/evGXWURVzTSJO6VTyM65tdwWB7
Zds77gYAYCIYKq+Iwv4iBCmo4KJofjKQcQepQUr+eGDv9syXebtp6fY0btnIS+DX
uGzzaixZNms9r2+FAiIlKwIeQgQvl76lYEGmvBrbrgSOyA/7GAkOId0E0Ul6D5LW
EJSwk9ZDbyE0JBEq6Bx+LClpwye+bpdScU26djQTTcWpFApIeJTyG9V6b1xwulVZ
rw68ZvfMYxktkvhTbEtvk2O9YZI5eQStBJkmJXeOiOduiP93aiC82MM1Jp+82Q1E
4qRVvCpGTwzF3GLFciUKAqmwfYxByo4G0/dwG8qw6WNEemLyXFHV5TkzLhgwl3kC
gDGl5AdH4MXj8NRjuHcDiGXfePBCD578dmz4xo5ZLA2yBavxkRzM8QsEUmD8hf5w
jhLgyKt0G2hNNtOnGOdG
=vLVl
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170202' into staging
ppc patch queue 2017-02-02
This obsoletes ppc-for-2.9-20170112, which had a MacOS build bug.
This is a long overdue ppc pull request for qemu-2.9. It's been a
long time coming due to some holidays and inconveniently timed
problems with testing. So, there's a lot in here:
* More POWER9 instruction implementations for TCG
* The simpler parts of my CPU compatibility mode cleanup
* This changes behaviour to prefer compatibility modes over
"raW" mode for new machine type versions
* New "40p" machine type which is essentially a modernized and
cleaned up "prep". The intention is that it will replace "prep"
once it has some more testing and polish.
* Add pseries-2.9 machine type
* Implement H_SIGNAL_SYS_RESET hypercall
* Consolidate the two alternate CPU init paths in pseries by
making it always go through CPU core objects to initialize CPU
* A number of bugfixes and cleanups
* Stop the guest timebase when the guest is stopped under KVM.
This makes the guest system clock also stop when paused, which
matches the x86 behaviour.
* Some preliminary cleanups leading towards implementation of the
POWER9 MMU.
There are also some changes not strictly related to ppc code, but for
its benefit:
* Limit the pxi-expander-bridge (PXB) device to x86 guests only
(it's essentially a hack to work around historical x86
limitations)
* Some additions to the 128-bit math in host_utils, necessary for
some of the new instructions.
* Revise a number of qtests and enable them for ppc
# gpg: Signature made Thu 02 Feb 2017 01:40:16 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.9-20170202: (107 commits)
hw/ppc/pnv: Use error_report instead of hw_error if a ROM file can't be found
ppc/kvm: Handle the "family" CPU via alias instead of registering new types
target/ppc/mmu_hash64: Fix incorrect shift value in amr calculation
target/ppc/mmu_hash64: Fix printing unsigned as signed int
tcg/POWER9: NOOP the cp_abort instruction
target/ppc/debug: Print LPCR register value if register exists
target-ppc: Add xststdc[sp, dp, qp] instructions
target-ppc: Add xvtstdc[sp,dp] instructions
target-ppc: Add MMU model check for booke machines
ppc: switch to constants within BUILD_BUG_ON
target/ppc/cpu-models: Fix/remove bad CPU aliases
target/ppc: Remove unused POWERPC_FAMILY(POWER)
spapr: clock should count only if vm is running
ppc: Remove unused function cpu_ppc601_rtc_init()
target/ppc: Add pcr_supported to POWER9 cpu class definition
powerpc/cpu-models: rename ISAv3.00 logical PVR definition
target-ppc: Add xvcv[hpsp, sphp] instructions
target-ppc: Add xsmulqp instruction
target-ppc: Add xsdivqp instruction
target-ppc: Add xscvsdqp and xscvudqp instructions
...
# Conflicts:
# hw/pci-bridge/Makefile.objs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It's a familiar pattern: some code uses ARRAY_SIZE, then refactoring
changes the argument from an array to a pointer to a dynamically
allocated buffer. Code keeps compiling but any ARRAY_SIZE calls now
return the size of the pointer divided by element size.
Let's add build time checks to ARRAY_SIZE before we allow more
of these in the code-base.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
QEMU_BUILD_BUG_ON uses a typedef in order to be safe
to use outside functions, but sometimes it's useful
to have a version that can be used within an expression.
Following what Linux does, introduce QEMU_BUILD_BUG_ON_ZERO
that return zero after checking condition at build time.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
There are theoretical concerns that some compilers might not trigger
build failures on attempts to define an array of size (x ? -1 : 1) where
x is a variable and make it a variable sized array instead. Let rewrite
using a struct with a negative bit field size instead as there are no
dynamic bit field sizes. This is similar to what Linux does.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Some headers use QEMU_BUILD_BUG_ON. This causes a problem
if the C file including that header happens to have
QEMU_BUILD_BUG_ON at the same line number.
Fix using a widely available extension: __COUNTER__.
If unavailable, provide a stub.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
All users include the trailing ; anyway, let's require that -
it seems cleaner.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Implements 128-bit left shift and right shift as well as their
testcases. By design, shift silently mods by 128, so the caller is
responsible to assert the shift range if necessary.
Left shift sets the overflow flag if any non-zero digit is shifted out.
Examples:
ulshift(&low, &high, 250, &overflow);
equivalent: n << 122
urshift(&low, &high, -2);
equivalent: n << 126
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[dwg: Added test-shift128 to .gitignore]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Pick a uniform chardev type name.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Bitmaps with a granularity of 58 or above can be neither serialized nor
deserialized (see the comment in the function added in this series for
an explanation). This patch adds a function so that we can check whether
a bitmap actually can be (de-)serialized at all, thus avoiding failing
the necessary assertion in hbitmap_serialization_granularity().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20161115225746.3590-2-mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Add also a missing parenthesis in a comment.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently we cannot directly transfer a QTAILQ instance because of the
limitation in the migration code. Here we introduce an approach to
transfer such structures. We created VMStateInfo vmstate_info_qtailq
for QTAILQ. Similar VMStateInfo can be created for other data structures
such as list.
When a QTAILQ is migrated from source to target, it is appended to the
corresponding QTAILQ structure, which is assumed to have been properly
initialized.
This approach will be used to transfer pending_events and ccs_list in spapr
state.
We also create some macros in qemu/queue.h to access a QTAILQ using pointer
arithmetic. This ensures that we do not depend on the implementation
details about QTAILQ in the migration code.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-3-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The existing default_config_files table in arch_init.c has a
single entry, making it completely unnecessary. The whole code
can be replaced by a single qemu_read_config_file() call in vl.c.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170117180051.11958-1-ehabkost@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently DNS resolution is done automatically as part
of the creation of a QIOChannelSocket object instance.
This works ok for network clients where you just end
up a single network socket, but for servers, the results
of DNS resolution may require creation of multiple
sockets.
Introducing a DNS resolver API allows DNS resolution
to be separated from the socket object creation. This
will make it practical to create multiple QIOChannelSocket
instances for servers.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the useless is_external argument. Since the iohandler
AioContext is never used for block devices, aio_disable_external
is never called on it. This lets us remove stubs/iohandler.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is complex, but I think it is reasonably documented in the source.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170112180800.21085-5-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A QemuLockCnt comprises a counter and a mutex, with primitives
to increment and decrement the counter, and to take and release the
mutex. It can be used to do lock-free visits to a data structure
whenever mutexes would be too heavy-weight and the critical section
is too long for RCU.
This could be implemented simply by protecting the counter with the
mutex, but QemuLockCnt is harder to misuse and more efficient.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170112180800.21085-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In the context of asynchronous work, if we have a worker coroutine that
didn't yield, the parent coroutine cannot be reentered because it hasn't
yielded yet. In this case we don't even have to reenter the parent
because it will see that the work is already done and won't even yield.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Commit 49cf57281b (vl: delay thread initialization after daemonization)
makes the global mutex is taken after daemonization instead before
daemonization by qemu_init_main_loop().
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-2-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's timer to expire, not clock.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-1-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap. This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All the variants for rol/ror have a bug in case where the shift == 0.
For example rol32, would generate:
return (word << 0) | (word >> 32);
Which though works, would be flagged as a runtime error on clang's
sanitizer.
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* NBD write zeroes support (Eric)
* Memory backend fixes (Haozhong)
* Atomics fix (Alex)
* New AVX512 features (Luwei)
* "make check" logging fix (Paolo)
* Chardev refactoring fallout (Paolo)
* Small checkpatch improvements (Paolo, Jeff)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQExBAABCAAbBQJYGaRPFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
XKgH/RgNtosBTqJsmphkS7wACFAFOf7Uq46ajoKfB66Pt1J/++pFQg4TApPYkb7j
KlKeKmXa7hb6+Jg8325H4zGkGno4kn2dE+OnznaB1xPKwiZVAMQVzQsagsEVqpno
k/5PBVRptIiuHQKyU29Go0CxbWJBTH0O14S7rDK4YDF0YMnuT280HQOI3jdu1igV
G/Q+CMgfk+yXf6GWHE8Z9sNq7n0ha8qgruA/X3NC7+pAvEsUcAP065zwLp9weYuK
W1MU68L7Ub4tRo0SVf1HFkDUNdMv4T4hg+wpGe1GwthJWexHu9x0YAQBy60ykJb6
NtHwjLwCUWtm7AiZD/btsOJPmjk=
=+Dt/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* NBD bugfix (Changlong)
* NBD write zeroes support (Eric)
* Memory backend fixes (Haozhong)
* Atomics fix (Alex)
* New AVX512 features (Luwei)
* "make check" logging fix (Paolo)
* Chardev refactoring fallout (Paolo)
* Small checkpatch improvements (Paolo, Jeff)
# gpg: Signature made Wed 02 Nov 2016 08:31:11 AM GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (30 commits)
main-loop: Suppress I/O thread warning under qtest
docs/rcu.txt: Fix minor typo
vl: exit qemu on guest panic if -no-shutdown is not set
checkpatch: allow spaces before parenthesis for 'coroutine_fn'
x86: add AVX512_4VNNIW and AVX512_4FMAPS features
slirp: fix CharDriver breakage
qemu-char: do not forward events through the mux until QEMU has started
nbd: Implement NBD_CMD_WRITE_ZEROES on client
nbd: Implement NBD_CMD_WRITE_ZEROES on server
nbd: Improve server handling of shutdown requests
nbd: Refactor conversion to errno to silence checkpatch
nbd: Support shorter handshake
nbd: Less allocation during NBD_OPT_LIST
nbd: Let client skip portions of server reply
nbd: Let server know when client gives up negotiation
nbd: Share common option-sending code in client
nbd: Send message along with server NBD_REP_ERR errors
nbd: Share common reply-sending code in server
nbd: Rename struct nbd_request and nbd_reply
nbd: Rename NbdClientSession to NBDClientSession
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
NBD commit 6d34500b clarified how clients and servers are supposed
to behave before closing a connection. It added NBD_REP_ERR_SHUTDOWN
(for the server to announce it is about to go away during option
haggling, so the client should quit sending NBD_OPT_* other than
NBD_OPT_ABORT) and ESHUTDOWN (for the server to announce it is about
to go away during transmission, so the client should quit sending
NBD_CMD_* other than NBD_CMD_DISC). It also clarified that
NBD_OPT_ABORT gets a reply, while NBD_CMD_DISC does not.
This patch merely adds the missing reply to NBD_OPT_ABORT and teaches
the client to recognize server errors. Actually teaching the server
to send NBD_REP_ERR_SHUTDOWN or ESHUTDOWN would require knowing that
the server has been requested to shut down soon (maybe we could do
that by installing a SIGINT handler in qemu-nbd, which transitions
from RUNNING to a new state that waits for the client to react,
rather than just out-right quitting - but that's a bigger task for
another day).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-15-git-send-email-eblake@redhat.com>
[Move dummy ESHUTDOWN to include/qemu/osdep.h. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.
While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.
For mingw32, we appear to have no viable solution for this. The locking
functions are not properly exported from the system runtime library.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Leave the implementation of error_vprintf and error_vprintf_unless_qmp
(the latter now trivially wrapped by error_printf_unless_qmp) to
libqemustub.a and monitor.c. This has two advantages: it lets us
remove the monitor_printf and monitor_vprintf stubs, and it lets
tests provide a different implementation of the functions that uses
g_test_message.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477326663-67817-2-git-send-email-pbonzini@redhat.com>
Make inet_connect_saddr() in util/qemu-sockets.c public in order to be
able to use it with InetSocketAddress sockets outside of
util/qemu-sockets.c independently.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It is simpler and a bit faster, and QEMU does not need the contention
callbacks (and thus the fairness) anymore.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-21-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
GRecMutex is new in glib 2.32, so we cannot use it. Introduce
a recursive mutex in qemu-thread instead, which will be used
instead of RFifoLock.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-20-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Force the use of cmpxchg16b on x86_64.
Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction. Further, it's required by Windows 8 so no new cpus
will ever omit it.
If we truely care about these, then we could check this at startup time
and then avoid executing paths that use it.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Allows Int128 to be used more generally, rather than having to
begin with 64-bit inputs and accumulate.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
While the check against sizeof(void *) is appropriate for
normal usage within qemu, there are places in which we want
wider operaions and have checked for their existance.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This paves the way for upcoming work.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-9-git-send-email-cota@braap.org>
This paves the way for upcoming work.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-8-git-send-email-cota@braap.org>
Making these functional rather than object macros will
prevent later problems with complex macro expansion.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Functions to serialize / deserialize(restore) HBitmap. HBitmap should be
saved to linear sequence of bits independently of endianness and bitmap
array element (unsigned long) size. Therefore Little Endian is chosen.
These functions are appropriate for dirty bitmap migration, restoring
the bitmap in several steps is available. To save performance, every
step writes only the last level of the bitmap. All other levels are
restored by hbitmap_deserialize_finish() as a last step of restoring.
So, HBitmap is inconsistent while restoring.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[Fix left shift operand to 1UL; add "finish" parameter. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-8-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Upon each bit toggle, the corresponding bit in the meta bitmap will be
set.
Signed-off-by: Fam Zheng <famz@redhat.com>
[Amended text inline. --js]
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-3-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
HBitmap is an implementation detail of block dirty bitmap that should be hidden
from users. Introduce a BdrvDirtyBitmapIter to encapsulate the underlying
HBitmapIter.
A small difference in the interface is, before, an HBitmapIter is initialized
in place, now the new BdrvDirtyBitmapIter must be dynamically allocated because
the structure definition is in block/dirty-bitmap.c.
Two current users are converted too.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-2-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This introduces load-acquire and store-release operations in QEMU.
For now, just use them as an implementation detail of atomic_mb_read
and atomic_mb_set.
Since docs/atomics.txt documents that atomic_mb_read only synchronizes
with an atomic_mb_set of the same variable, we can use the new implementation
everywhere instead of seq-cst loads and stores.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
iQEcBAABAgAGBQJX/feXAAoJEJykq7OBq3PIatAIAKoT0dQpc6L0kPM9QYX0soBR
sQ75/EyPyHK+7ly6huS0Zp78RHd98CVePdH/hn+Ha3in87ScBV/NXxgKJ/aUP8Mm
04aMkXgn0tydBndtSTom2sBHOyAl5RUCRH8JBhGQ5p+dbzfeeSU9or4UZdEiChp2
DyvmF7gjjGl8jsDl1G9KsxqOYzwgpxSVZzvzzjjK7ZlJ0fW0gkYInDIzyna/YGJP
Ea7Dc0lLN4wM7rHUk1CgTD8KzSACsmmVy4OfTUo+ij6RugtrCcsiouHqWZ3nYzbN
FHux0hXl02wnUuaZmSChgHxLDVG8pcNcWVt4wuh7wHXo8aNih8UeQLQnkdDnx9o=
=bcLW
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Wed 12 Oct 2016 09:43:03 BST
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/tracing-pull-request:
trace: Add missing execution mode of guest events
trace: introduce a formal group name for trace events
trace: pass trace-events to tracetool as a positional param
trace: push reading of events up a level to tracetool main
trace: rename _read_events to read_events
trace: get rid of generated-events.h/generated-events.c
trace: dynamically allocate event IDs at runtime
trace: dynamically allocate trace_dstate in CPUState
trace: provide mechanism for registering trace events
trace: don't abort qemu if ftrace can't be initialized
trace: emit name <-> ID mapping in simpletrace header
trace: remove the TraceEventID and TraceEventVCPUID enums
trace: give each trace event a named TraceEvent struct
trace: break circular dependency in event-internal.h
trace: remove duplicate control.h includes in generated-tracers.h
trace: remove global 'uint16 dstate[]' array
trace: remove some now unused functions
trace: convert code to use event iterators
trace: add trace event iterator APIs
trace: move colo trace events to net/ sub-directory
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remove the notion of there being a single global array
of trace events, by introducing a method for registering
groups of events.
The module_call_init() needs to be invoked at the start
of any program that wants to make use of the trace
support. Currently this covers system emulators qemu-nbd,
qemu-img and qemu-io.
[Squashed the following fix from Daniel P. Berrange
<berrange@redhat.com>:
linux-user/bsd-user: initialize trace events subsystem
The bsd-user/linux-user programs make use of the CPU emulation
code and this now requires that the trace events subsystem
is enabled, otherwise it'll crash trying to allocate an empty
trace events bitmap for the CPU object.
--Stefan]
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1475588159-30598-14-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
C99 requires SIZE_MAX to be declared with the same type as the
integral promotion of size_t, but OSX mistakenly defines it as
an 'unsigned long long' expression even though size_t is only
'unsigned long'. Rather than futzing around with whether size_t
is 32- or 64-bits wide (which would be needed if we cared about
using SIZE_T in a #if expression), just hard-code it with a cast.
This is not a strict C99-compliant definition, because it doesn't
work in the preprocessor, but if we later need that, the build
will break on Mac to inform us to improve our replacement at that
time.
See also https://patchwork.ozlabs.org/patch/542327/ for an
instance where the wrong type trips us up if we don't fix it
for good in osdep.h.
Some versions of glibc make a similar mistake with SSIZE_MAX; the
goal is that the approach of this patch could be copied to work
around that problem if it ever becomes important to us.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1476200784-17210-1-git-send-email-eblake@redhat.com
Reviewed-by: John Arbuckle <programmingkidx@gmail.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCAAGBQJX+LTGAAoJEHAbT2saaT5ZIBwH+wfho+xxruEjro6qPvSAtdKk
BBsOWBfBoqWfbAbOxxCO8ina2nA7p5XbyzSXUr94nZhvZMB9BkgL6la03gdS0Yr2
jHf0J9mM8fIbMQFsEKGOPcdpvU7VEXeFwridZYzypiRvbNSdWK3SKVBKgz2ADNhb
l4Tos81IZeH/mw8HcU3XgSGSTV4JuKP4XsnmwlFMa8/sWM/X3vVgx5IG26KURZQm
pW720jcX0meSfji5YvhspfbBbp1g2EorTZb6iLcZf+OUIB6XkViMisVasnyOo2HJ
cehPlhAHixwq1kXGItc1fs11VloZ6hvEZ7kZ615jAdsD2sGJObtGDxgyJW3+gPo=
=HPHj
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging
trivial patches for 2016-10-08
# gpg: Signature made Sat 08 Oct 2016 09:56:38 BST
# gpg: using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg: aka "Michael Tokarev <mjt@corpit.ru>"
# gpg: aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59
* remotes/mjt/tags/trivial-patches-fetch: (26 commits)
net/filter-mirror: Fix mirror initial check typo
virtio: rename the bar index field name in VirtIOPCIProxy
linux-user: include <poll.h> instead of <sys/poll.h>
char: fix missing return in error path for chardev TLS init
CODING_STYLE: Fix a typo ("have" vs. "has")
bitmap: refine and move BITMAP_{FIRST/LAST}_WORD_MASK
build-sys: fix find-in-path
m68k: change default system clock for m5208evb
exec: remove unused compacted argument
usb: ehci: fix memory leak in ehci_process_itd
qapi: make the json schema files more regular.
maint: Add module_block.h to .gitignore
MAINTAINERS: Some updates related to the SH4 machines
MAINTAINERS: Add some more MIPS related files
MAINTAINERS: Add usermode related config files
MAINTAINERS: Add some more pattern to recognize all win32 related files
MAINTAINERS: Add some more rocker related files
MAINTAINERS: Add header files to CRIS section
MAINTAINERS: Add some more files to the virtio section
MAINTAINERS: Add some SPARC machine related files
...
# Conflicts:
# MAINTAINERS
According to linux kernel commit <89c1e79eb30> ("linux/bitmap.h: improve
BITMAP_{LAST,FIRST}_WORD_MASK"), these two macro could be improved.
This patch takes this change and also move them all in header file.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is a small helper that tries to fetch binary name for given
PID.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <4d75d475c1884f8e94ee8b1e57273ddf3ed68bf7.1474987617.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is a data race if the sequence is written concurrently to the
read. In C11 this has undefined behavior. Use atomic_set; the
read side is already using atomic_read.
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160930213106.20186-6-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add some notes on the use of the relaxed atomic access helpers and their
importance for defined behaviour in C11's multi-threaded memory model.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160930213106.20186-3-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only very modern GCC's actually set this define when building with the
ThreadSanitizer so this little typo slipped though.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160930213106.20186-2-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
See the doc comments for a description of this new coroutine API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1474989516-18255-2-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
As discussed on the list [1], having a comment stating that this file
is "public domain" is arguably wrong and not legally binding. This patch
replaces that comment with a clear GPLv2+ license as proposed in [2].
[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg06151.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg06217.html
Worth noting, compiler.h was originally created on 5c026320 by splitting
qemu-common.h. At the time, qemu-common.h was already GPLv2+.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1474642971-11866-1-git-send-email-felipe@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jhash will be used by colo-compare and filter-rewriter
to save and lookup net connection info
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Update all qemu_uuid users as well, especially get rid of the duplicated
low level g_strdup_printf, sscanf and snprintf calls with QEMU UUID API.
Since qemu_uuid_parse is quite tangled with qemu_uuid, its switching to
QemuUUID is done here too to keep everything in sync and avoid code
churn.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-10-git-send-email-famz@redhat.com>
A number of different places across the code base use CONFIG_UUID. Some
of them are soft dependency, some are not built if libuuid is not
available, some come with dummy fallback, some throws runtime error.
It is hard to maintain, and hard to reason for users.
Since UUID is a simple standard with only a small number of operations,
it is cleaner to have a central support in libqemuutil. This patch adds
qemu_uuid_* functions that all uuid users in the code base can
rely on. Except for qemu_uuid_generate which is new code, all other
functions are just copy from existing fallbacks from other files.
Note that qemu_uuid_parse is moved without updating the function
signature to use QemuUUID, to keep this patch simple.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-2-git-send-email-famz@redhat.com>
Extend the current module interface to allow for block drivers to be
loaded dynamically on request. The only block drivers that can be
converted into modules are the drivers that don't perform any init
operation except for registering themselves.
In addition, only the protocol drivers are being modularized, as they
are the only ones which see significant performance benefits. The format
drivers do not generally link to external libraries, so modularizing
them is of no benefit from a performance perspective.
All the necessary module information is located in a new structure found
in module_block.h
This spoils the purpose of 5505e8b76f (block/dmg: make it modular).
Before this patch, if module build is enabled, block-dmg.so is linked to
libbz2, whereas the main binary is not. In downstream, theoretically, it
means only the qemu-block-extra package depends on libbz2, while the
main QEMU package needn't to. With this patch, we (temporarily) change
the case so that the main QEMU depends on libbz2 again.
Signed-off-by: Marc Marí <markmb@redhat.com>
Signed-off-by: Colin Lord <clord@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1471008424-16465-4-git-send-email-clord@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
[mreitz: Do a signed comparison against the length of
block_driver_modules[], so it will not cause a compile error when
empty]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Unused function declarations were found using a simple gcc plugin and
manually verified by grepping the sources.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1472496380-19706-6-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the two users don't make use of the returned offset,
beyond ensuring that the entire buffer is zero, consider the
can_use_buffer_find_nonzero_offset and buffer_find_nonzero_offset
functions internal.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1472496380-19706-4-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the __atomic_*_n() primitives which take the value as argument. It
is not necessary to store the value locally before calling the
primitive, hence saving us a stack store and load.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160829171701.14025-1-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the redundant barrier() after the fence as agreed in previous
discussion here:
https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00489.html
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20160824204424.14041-3-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The comments is outdated. The patch has following changes:
1. tense correction.
2. all clock time value is returned in nanoseconds, so, they are same in
precision.
3. virtual clock doesn't use cpu cycles.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469790338-28990-2-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
instead of accessing tqe_prev field dircetly outside
of queue.h use macros to check if element is in list
and make sure that afer element is removed from list
tqe_prev field could be used to do the same check.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469450832-84343-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
A coroutine that takes a lock must also release it again. If the
coroutine terminates without having released all its locks, it's buggy
and we'll probably run into a deadlock sooner or later. Make sure that
we don't get such cases.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
In cases of deadlocks, knowing who holds a given CoMutex is really
helpful for debugging. Keeping the information around doesn't cost much
and allows us to add another assertion to keep the code correct, so
let's just add it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
With the latest clang, we have the following warning:
/home/pranith/devops/code/qemu/include/qemu/seqlock.h:62:21: warning: passing 'typeof (*&sl->sequence) *' (aka 'const unsigned int *') to parameter of type 'unsigned int *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
return unlikely(atomic_read(&sl->sequence) != start);
^~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pranith/devops/code/qemu/include/qemu/atomic.h:58:25: note: expanded from macro 'atomic_read'
__atomic_load(ptr, &_val, __ATOMIC_RELAXED); \
^~~~~
Stripping const is a bit tricky due to promotions, but it is doable
with either C11 _Generic or GCC extensions. Use the latter.
Reported-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[pranith: Add conversion for bool type]
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rather than rely on recursion during the middle of register allocation,
lower indirect registers to loads and stores off the indirect base into
plain temps.
For an x86_64 host, with sufficient registers, this results in identical
code, modulo the actual register assignments.
For an i686 host, with insufficient registers, this means that temps can
be (temporarily) spilled to the stack in order to satisfy an allocation.
This as opposed to the possibility of not being able to spill, to allocate
a register for the indirect base, in order to perform a spill.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Make it obvious which macros are safe in which situations.
Useful since QEMU_ALIGN_UP and ROUND_UP both purport to do
the same thing, but differ on whether the alignment must be
a power of 2.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1469129688-22848-4-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit e65c67e4, inet_listen() is not used anymore, and all
inet listen operation goes through QIOChannel.
Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469451771-1173-3-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is never used; all nonblocking connect now goes through
socket_connect(), which calls unix_connect_addr().
Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469097213-26441-3-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is never used; all nonblocking connect now goes through
socket_connect(), which calls inet_connect_addr().
Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469097213-26441-2-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When adding hostmem backend at runtime, QEMU might exit with error:
"os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM"
It happens due to os_mem_prealloc() not handling errors gracefully.
Fix it by passing errp argument so that os_mem_prealloc() could
report error to callers and undo performed allocation when
os_mem_prealloc() fails.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469008443-72059-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is naturally expected that some memory ordering should be provided
around qht_insert() and qht_lookup(). Document these assumptions in the
header file and put some comments in the source to denote how that
memory ordering requirements are fulfilled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Sergey Fedorov: commit title and message provided;
comment on qht_remove() elided]
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-Id: <20160715175852.30749-2-sergey.fedorov@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Assertions help both Coverity and the clang static analyzer avoid
false positives, but on the other hand both are confused when
the condition is compiled as (void)(x != FOO). Always expand
assertion macros when using Coverity or clang, through a new
QEMU_STATIC_ANALYSIS preprocessor symbol.
This fixes a couple false positives in TCG.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* fixes to qemu-char and net exit
* FreeBSD fixes
* Other small bugfixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXhiZDAAoJEL/70l94x66DrGAH/10ZlIYugx6Ijn12qy3irmIC
hbMY6HWjvPlk8ZpAcPa3UXNQvqhTwqhSMXRiwp9aNPlRUqrXnDXZapQunJveKSAn
luLE8ISRKODz0W39qg6znyb4R1ipCGJWwjBCQmLWZuD7883JJ2DsykTATRx7yKQF
qsq9r/DPBTfD3vnOCTbqp0GeB80UFleTNm+K7cct8M1+WzfiwKeVHk9CAKy0fkTH
hS+YnV9UWYL6PR/w+uZ+2MfgH5er4X794+HaNbio0QJJbEZ2bsL4A3Prh7pUonN7
qJoCbT4W79scrnWQ40RbWRXOMfUk4J7gIMEZYar8z6NmqnamNZgxbWj3dv6pO+k=
=sz/L
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* SCSI scanner support
* fixes to qemu-char and net exit
* FreeBSD fixes
* Other small bugfixes
# gpg: Signature made Wed 13 Jul 2016 12:30:11 BST
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
hostmem: detect host backend memory is being used properly
hostmem: fix QEMU crash by 'info memdev'
char: do not use atexit cleanup handler
net: do not use atexit for cleanup
slirp: use exit notifier for slirp_smb_cleanup
tap: use an exit notifier to call down_script
util: Fix MIN_NON_ZERO
qemu-sockets: use qapi_free_SocketAddress in cleanup
disas: avoid including everything in headers compiled from C++
json-streamer: fix double-free on exiting during a parse
main-loop: check return value before using pointer
Use "-s" instead of "--quiet" to resolve non-fatal build error on FreeBSD.
scsi-bus: Use longer sense buffer with scanners
scsi-bus: Add SCSI scanner support
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ratelimit_calculate_delay() previously reset the accounting every time
slice, no matter how much data had been processed before. This had (at
least) two consequences:
1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream.
Not sure if there are real-world use cases where this would be a
problem. Mirroring and backup over a slow link (e.g. DSL) would
come to mind, though.
2. Tests for block job operations (e.g. cancel) were rather racy
All block jobs currently use a time slice of 100ms. That's a
reasonable value to get smooth output during regular
operation. However this also meant that the state of block jobs
changed every 100ms, no matter how low the configured limit was. On
busy hosts, qemu often transferred additional chunks until the test
case had a chance to cancel the job.
Fix the block job rate limit code to delay for more than one time
slice to address the above issues. To make it easier to handle
oversized chunks we switch the semantics from returning a delay
_before_ the current request to a delay _after_ the current
request. If necessary, this delay consists of multiple time slice
units.
Since the mirror job sends multiple chunks in one go even if the rate
limit was exceeded in between, we need to keep track of the start of
the current time slice so we can correctly re-compute the delay for
the updated amount of data.
The minimum bandwidth now is 1 data unit per time slice. The block
jobs are currently passing the amount of data transferred in sectors
and using 100ms time slices, so this translates to 5120
bytes/second. With chunk sizes usually being O(512KiB), tests have
plenty of time (O(100s)) to operate on block jobs. The chance of a
race condition now is fairly remote, except possibly on insanely
loaded systems.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-id: 1467127721-9564-2-git-send-email-silbe@linux.vnet.ibm.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
In practice the entry argument is always known at creation time, and
it is confusing that sometimes qemu_coroutine_enter is used with a
non-NULL argument to re-enter a coroutine (this happens in
block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value
at creation time, for consistency with e.g. aio_bh_new.
Mostly done with the following semantic patch:
@ entry1 @
expression entry, arg, co;
@@
- co = qemu_coroutine_create(entry);
+ co = qemu_coroutine_create(entry, arg);
...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);
@ entry2 @
expression entry, arg;
identifier co;
@@
- Coroutine *co = qemu_coroutine_create(entry);
+ Coroutine *co = qemu_coroutine_create(entry, arg);
...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);
@ entry3 @
expression entry, arg;
@@
- qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
+ qemu_coroutine_enter(qemu_coroutine_create(entry, arg));
@ reentry @
expression co;
@@
- qemu_coroutine_enter(co, NULL);
+ qemu_coroutine_enter(co);
except for the aforementioned few places where the semantic patch
stumbled (as expected) and for test_co_queue, which would otherwise
produce an uninitialized variable warning.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
CoQueue do not need to remove any element but the head of the list;
processing is always strictly FIFO. Therefore, the simpler singly-linked
QSIMPLEQ can be used instead.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
And use it in qemu_dup_flags.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
MIN_NON_ZERO(1, 0) is evaluated to 0. Rewrite the macro to fix it.
Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1468306113-847-1-git-send-email-famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXhP1vAAoJEDhwtADrkYZTSnsP/050pn1Zeo8tF7+C9Sj/8mu1
Peuaz7f+QkGzvkteoVu8W2dDNebUnjZI1InsBFcpJab8TCjnk+A2UipalTyWlX2U
QQbgqgLKTJgewVPMje+7B96D2Eyr4IhnhiaJDP7zBtx5tW+mO9NDqly7j9VJajbO
cBC9SBFLNEk3wQqiQMv4Ig6qi3a4jfU56Zhazt6ci7lJRKJwPnbhZjEWCKliRjzT
TjZsV8mlKm5Sp3wmahxGiWuUI+JnbneEnjLLvRBot5mUGZsA9oU4mgmzFBn+/uvp
rzvae41tfw1Td/S7nRBwTxZqRo8qiRSC5vNc90piBKkj7H1TaKnZUVBskkexZl4x
ibFOAVPx3yJredtCgjJZun2gc/EBpIQtcV9WNqY9qjwldczjim93ua/lM7BcTrbN
4rdNP63a81KYOkgL3rSd9zVarGybHIrhd96KUhX8j6SKbc7z4gVn90uMnAPsapk/
I39HR3smMvgNK3x6gZ4pU3Ghr2hmmWMah+gQ2ErczmoYArAixR8lDqFO+c2eoqfe
bHfZM6gsZickrJSL1t3/FXB0oXfcNifUzF5g2hDQY9WKDjFee4tb2+UGHcrb7PW8
Uq3eYqigNitj1COMksjijG0iBAfCqtJNPJDA4M0SLhuYVnH+cSGWYg3UWgETwu8q
nYZHU/nNqI9p3BNFkILR
=Zct/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-include-2016-07-12' into staging
Clean up #include "..." vs <...> and header guards
# gpg: Signature made Tue 12 Jul 2016 15:23:43 BST
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-include-2016-07-12:
cris: Fix broken header guard in hw/cris/boot.h
Clean up decorations and whitespace around header guards
Clean up ill-advised or unusual header guards
libdecnumber: Don't error out on decNumberLocal.h re-inclusion
libdecnumber: Don't fool around with guards to avoid #include
Clean up header guards that don't match their file name
Drop Emacs local variables lists redundant with .dir-locals.el
spapr_pci: Include spapr.h instead of playing games with #error
tcg: Clean up tcg-target.h header guards
linux-user: Fix broken header guard in syscall_defs.h
linux-user: Clean up hostdep.h header guards
linux-user: Clean up target_structs.h header guards
linux-user: Clean up target_signal.h header guards
linux-user: Clean up target_cpu.h header guards
linux-user: Clean up target_syscall.h header guards
target-*: Clean up cpu.h header guards
scripts: New clean-header-guards.pl
Use #include "..." for our own headers, <...> for others
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Header guard symbols should match their file name to make guard
collisions less likely. Offenders found with
scripts/clean-header-guards.pl -vn.
Cleaned up with scripts/clean-header-guards.pl, followed by some
renaming of new guard symbols picked by the script to better ones.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tracked down with an ugly, brittle and probably buggy Perl script.
Also move includes converted to <...> up so they get included before
ours where that's obviously okay.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Add a documentation comment describing the functions for
converting between the cpu and little or bigendian formats.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1467908460-27048-6-git-send-email-peter.maydell@linaro.org
Now that all uses of cpu_to_*w() and *_to_cpup() have been replaced
with either ld*_p()/st*_p() or by doing direct dereferences and
using the cpu_to_*()/*_to_cpu() byteswap functions, we can remove
the unused implementations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1467908460-27048-4-git-send-email-peter.maydell@linaro.org
The function is just a helper to handle the -global options, it
can stay in vl.c like most qemu_opts_foreach() calls.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rather than rolling our own clone via an expensive conversion
in and back out of QObject, use the new clone visitor.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-15-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
iommus can not be added with -device.
cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXe4l4AAoJECgfDbjSjVRpIz4IALye7mKG61/POA4Gqmhalc3d
HnlNSZ2YcKAuvPg7WWkBuRacrQvVY/MbW1mLloG1lY0tdFgZG8Cy+CY6wJg1NE4c
cXd+77vHkIyrnl+Nil+QOgTFiAsMnD+mXHHsnCDw2jGn3JbgVNuCMi7V34fGkQd2
PDkZyYfwTqO3HytuG0/j2Somc9du1gjYdn+9qigfZVgP96jGDojBuJWuuU5flCB3
Kj5xrOuI01XlbdTk71tVjBJBektQurWr6r7GECDqZIpUfc+BI70FU9jPh+OlLTO/
92yi29ncjyStz4tRnf18xoQ8uBgH/tD1xigEUPRtnm1+0i/tgONBL8cAdBF9FBE=
=ABGE
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes
iommus can not be added with -device.
cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 05 Jul 2016 11:18:32 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (30 commits)
vmw_pvscsi: remove unnecessary internal msi state flag
e1000e: remove unnecessary internal msi state flag
vmxnet3: remove unnecessary internal msi state flag
mptsas: remove unnecessary internal msi state flag
megasas: remove unnecessary megasas_use_msi()
pci: Convert msi_init() to Error and fix callers to check it
pci bridge dev: change msi property type
megasas: change msi/msix property type
mptsas: change msi property type
intel-hda: change msi property type
usb xhci: change msi/msix property type
change pvscsi_init_msi() type to void
tests: add APIC.cphp and DSDT.cphp blobs
tests: acpi: add CPU hotplug testcase
log: Permit -dfilter 0..0xffffffffffffffff
range: Replace internal representation of Range
range: Eliminate direct Range member access
log: Clean up misuse of Range for -dfilter
pci_register_bar: cleanup
Revert "virtio-net: unbreak self announcement and guest offloads after migration"
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Range represents a range as follows. Member @start is the inclusive
lower bound, member @end is the exclusive upper bound. Zero @end is
special: if @start is also zero, the range is empty, else @end is to
be interpreted as 2^64. No other empty ranges may occur.
The range [0,2^64-1] cannot be represented. If you try to create it
with range_set_bounds1(), you get the empty range instead. If you try
to create it with range_set_bounds() or range_extend(), assertions
fail. Before range_set_bounds() existed, the open-coded creation
usually got you the empty range instead. Open deathtrap.
Moreover, the code dealing with the janus-faced @end is too clever by
half.
Dumb this down to a more pedestrian representation: members @lob and
@upb are inclusive lower and upper bounds. The empty range is encoded
as @lob = 1, @upb = 0.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Users of struct Range mess liberally with its members, which makes
refactoring hard. Create a set of methods, and convert all users to
call them instead of accessing members. The methods have carefully
worded contracts, and use assertions to check them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a macro that creates a 64bit value which has length number of ones
shifted across by the value of shift.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 9773244aa1c8c26b8b82cb261d8f5dd4b7b9fcf9.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Calling our function g_list_insert_sorted_merged is a misnomer,
since we are NOT writing a glib function. Furthermore, we are
making every caller pass the same comparator function of
range_merge(): any caller that would try otherwise would break
in weird ways since our internal call to ranges_can_merge() is
hard-coded to operate only on ranges, rather than paying
attention to the caller's comparator.
Better is to fix things so that callers don't have to care about
our internal comparator, by picking a function name and updating
the parameter type away from a gratuitous use of void*, to make
it obvious that we are operating specifically on a list of ranges
and not a generic list. Plus, refactoring the code here will
make it easier to plug a memory leak in the next patch.
range_compare() is now internal only, and moves to the .c file.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464712890-14262-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
g_list_insert_sorted_merged() is rather large to be an inline
function; move it to its own file. range_merge() and
ranges_can_merge() can likewise move, as they are only used
internally. Also, it becomes obvious that the condition within
range_merge() is already satisfied by its caller, and that the
return value is not used.
The diffstat is misleading, because of the copyright boilerplate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464712890-14262-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
qemu leaves unix socket files behind when removing a listening chardev
or leaving. qemu could clean that up, even if doing so isn't race-free.
Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1347077
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1466105332-10285-4-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use socket_*() functions from include/qemu/sockets.h instead of
listen()/bind()/connect()/parse_host_port(). socket_*() fucntions are
QAPI based and this patch performs this api conversion since
everything will be using QAPI based sockets in the future. Also add a
helper function socket_address_to_string() in util/qemu-sockets.c
which returns the string representation of socket address. Thetask was
listed on http://wiki.qemu.org/BiteSizedTasks page.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When qemu_set_log_filename() detects an invalid file name, it reports
an error, closes the log file (if any), and starts logging to stderr
(unless daemonized or nothing is being logged).
This is wrong. Asking for an invalid log file on the command line
should be fatal. Asking for one in the monitor should fail without
messing up an existing logfile.
Fix by converting qemu_set_log_filename() to Error. Pass it
&error_fatal, except for hmp_logfile report errors.
This also permits testing without a subprocess, so do that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1466011636-6112-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
g_error() is not an acceptable way to report errors to the user:
$ qemu-system-x86_64 -dfilter 1000+0
** (process:17187): ERROR **: Failed to parse range in: 1000+0
Trace/breakpoint trap (core dumped)
g_assert() isn't, either:
$ qemu-system-x86_64 -dfilter 1000x+64
**
ERROR:/work/armbru/qemu/util/log.c:180:qemu_set_dfilter_ranges: assertion failed: (e == range_op)
Aborted (core dumped)
Convert qemu_set_dfilter_ranges() to Error. Rework its deeply nested
control flow. Touch up the error messages. Call it with
&error_fatal.
This also permits testing without a subprocess, so do that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1466011636-6112-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
A half-shuffle operation takes a word with zeros in the high half:
0000 0000 0000 0000 ABCD EFGH IJKL MNOP
and spreads the bits out so they are in every other bit of the word:
0A0B 0C0D 0E0F 0G0H 0I0J 0K0L 0M0N 0O0P
A half-unshuffle performs the reverse operation.
Provide functions in bitops.h which implement these operations
for 32-bit and 64-bit inputs, and add tests for them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-3-git-send-email-peter.maydell@linaro.org
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h. Include it in
sysemu/os-posix.h and remove it from everywhere else.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a fast, scalable chained hash table with optional auto-resizing, allowing
reads that are concurrent with reads, and reads/writes that are concurrent
with writes to separate buckets.
A hash table with these features will be necessary for the scalability
of the ongoing MTTCG work; before those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-11-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Sometimes it is useful to have a quick histogram to represent a certain
distribution -- for example, when investigating a performance regression
in a hash table due to inadequate hashing.
The appended allows us to easily represent a distribution using Unicode
characters. Further, the data structure keeping track of the distribution
is so simple that obtaining its values for off-line processing is trivial.
Example, taking the last 10 commits to QEMU:
Characters in commit title Count
-----------------------------------
39 1
48 1
53 1
54 2
57 1
61 1
67 1
78 1
80 1
qdist_init(&dist);
qdist_inc(&dist, 39);
[...]
qdist_inc(&dist, 80);
char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);
char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-9-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
[Rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Emilio's additions: use TAS instead of atomic_xchg; emit acquire/release
barriers; return bool from trylock; call cpu_relax() while spinning;
optimize for uncontended locks by acquiring the lock with TAS instead
of TATAS; add qemu_spin_locked().]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-6-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Taken from the linux kernel.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-5-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-4-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-3-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Remove glib.h includes, as it is provided by osdep.h.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently we emit a consume-load in atomic_rcu_read. Because of
limitations in current compilers, this is overkill for non-Alpha hosts
and it is only useful to make Thread Sanitizer work.
This patch leaves the consume-load in atomic_rcu_read when
compiling with Thread Sanitizer enabled, and resorts to a
relaxed load + smp_read_barrier_depends otherwise.
On an RMO host architecture, such as aarch64, the performance
improvement of this change is easily measurable. For instance,
qht-bench performs an atomic_rcu_read on every lookup. Performance
before and after applying this patch:
$ tests/qht-bench -d 5 -n 1
Before: 9.78 MT/s
After: 10.96 MT/s
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-4-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For correctness, smp_read_barrier_depends() is only required to
emit a barrier on Alpha hosts. However, we are currently emitting
a consume fence unconditionally, and most compilers currently treat
consume and acquire fences as equivalent.
Fix it by keeping the consume fence if we're compiling with Thread
Sanitizer, since this might help prevent false warnings. Otherwise,
only emit the barrier for Alpha hosts. Note that we still guarantee
that smp_read_barrier_depends() is a compiler barrier.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-3-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>