The new paging mode is an extension of IA-32e mode with one additional page
table level.
It brings support for a 57-bit virtual address space (128PB) and a 52-bit
physical address space (4PB).
The structure of the new page table level is identical to PML4.
The feature is enumerated with CPUID.(EAX=07H, ECX=0):ECX[bit 16].
CR4.LA57[bit 12] needs to be set when paging is enabled to activate 5-level
paging mode.
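As a rough illustration of the enumeration bit, the control bit and the
address arithmetic described above, the following Python sketch splits a
virtual address into 9-bit table indices plus a 12-bit page offset. The
function names are hypothetical and are not taken from QEMU. The 9-bit
split assumes the new level has the same 512-entry layout as PML4, and
paging_levels() assumes IA-32e paging is already enabled.

```python
CPUID_7_0_ECX_LA57 = 1 << 16  # CPUID.(EAX=07H, ECX=0):ECX[bit 16]
CR4_LA57 = 1 << 12            # CR4.LA57[bit 12]
PAGE_OFFSET_BITS = 12         # 4 KiB pages
INDEX_BITS = 9                # 512 entries per table, same as PML4


def la57_supported(cpuid_7_0_ecx):
    """True if the CPUID leaf 7 ECX value advertises 5-level paging."""
    return bool(cpuid_7_0_ecx & CPUID_7_0_ECX_LA57)


def paging_levels(cr4):
    """Number of page table levels (assumes IA-32e paging is enabled)."""
    return 5 if cr4 & CR4_LA57 else 4


def virtual_address_bits(levels):
    """Width of the virtual address space for a given number of levels."""
    return PAGE_OFFSET_BITS + INDEX_BITS * levels


def split_virtual_address(va, levels):
    """Return ([top-level index, ..., lowest-level index], page offset)."""
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    indices = [
        (va >> (PAGE_OFFSET_BITS + INDEX_BITS * level)) & ((1 << INDEX_BITS) - 1)
        for level in reversed(range(levels))
    ]
    return indices, offset
```

With five levels the width is 12 + 5 * 9 = 57 bits, and 2^57 bytes is
128PB, which matches the figure quoted above.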
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Message-Id: <20161215001305.146807-1-kirill.shutemov@linux.intel.com>
[Drop changes to target-i386/translate.c. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The syscall and sysret instructions behave a bit differently:
TF is checked after the instruction completes.
This allows the OS to disable #DB at a syscall by adding TF to FMASK.
Then, when the sysret is executed, the #DB is taken "as if" the
syscall insn just completed.
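The behaviour described above can be pictured with a toy Python model.
It is not QEMU's implementation: the function names are hypothetical,
and the model only tracks RFLAGS, the copy that syscall saves in R11,
and whether a single-step #DB would be raised once the instruction has
completed.

```python
TF = 1 << 8  # RFLAGS.TF, the single-step trap flag


def syscall_model(rflags, fmask):
    """Model syscall: save RFLAGS in R11, then clear the bits set in FMASK.

    Returns (new_rflags, r11, debug_trap). TF is checked on the flags as
    they are after the instruction completes.
    """
    r11 = rflags
    new_rflags = rflags & ~fmask
    return new_rflags, r11, bool(new_rflags & TF)


def sysret_model(r11):
    """Model sysret: restore RFLAGS from R11, then check TF."""
    new_rflags = r11
    return new_rflags, bool(new_rflags & TF)
```

With TF set in both RFLAGS and FMASK, syscall_model() reports no trap
while sysret_model() reports one, which is the "as if" case above.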
Signed-off-by: Doug Evans <dje@google.com>
Message-Id: <94eb2c0bfa1c6a9fec0543057483@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
indicates that KVM_GET_CLOCK returns a value as seen by the guest at
that moment.
For new machine types, use this value rather than reading
from guest memory.
This reduces kvmclock difference on migration from 5s to 0.1s
(when max_downtime == 5s).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20161121105052.598267440@redhat.com>
[Add comment explaining what is going on. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a scsi-disk object receives a VERIFY command with the BYTCHK bit set
to zero, scsi_block_is_passthrough returns false, so the request ends up
being processed by scsi_block_dma_command. Because scsi_block_dma_command
no longer handles VERIFY, QEMU aborts in this function.
Reported-by: Junlian Bell <zhongjun@sangfor.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch fixes the confusing assertion failure message caused by an
uninitialized device structure (from the bite-sized tasks).
The bug can be reproduced by
./qemu-system-x86_64 -nographic -device cfi.pflash01
The CFI hardware is dynamically loaded by the QOM realize mechanism.
However, the realize function pflash_cfi01_realize requires the device
to be initialized manually before it is called, like
./qemu-system-x86_64 -nographic
-device cfi.pflash01,num-blocks=1024,sector-length=4096,name=testcard
If the initialization parameters are left off the command line, the
device structure stays uninitialized, which makes
pflash_cfi01_realize try to realize a zero-sized card, causing
/mnt/EXT_volume/projects/qemu/qemu-dev/exec.c:1378:
find_ram_offset: Assertion `size != 0' failed.
In my tests, at least the flash device's number of blocks, sector length
and name are needed for pflash_cfi01_realize to behave correctly. So
I think the new asserts are needed to tell the QEMU user to specify
the device's parameters correctly.
Signed-off-by: Ziyue Yang <skiver.cloud.yzy@gmail.com>
Message-Id: <1481810693-13733-1-git-send-email-skiver.cloud.yzy@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ziyue Yang <yzylivezh@hotmail.com>
get_opt_value() truncates the value at the first comma.
Use memcpy() instead so that -append works correctly in the
presence of commas. For -initrd to work correctly, on the other hand,
unescape the module filename and parameters with get_opt_value()
before calling mb_add_cmdline().
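To show why the truncation matters, here is a small Python sketch of
the comma convention used by QEMU option strings, where a single comma
ends a value and a doubled comma stands for one literal comma. The
function name is hypothetical and the code only approximates what
get_opt_value() does. It is not a port of the C function.

```python
def get_opt_value_sketch(s):
    """Return (value, rest): stop at the first single comma, unescape ',,'."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == ",":
            if i + 1 < len(s) and s[i + 1] == ",":
                # A doubled comma is an escaped literal comma.
                out.append(",")
                i += 2
                continue
            # A single comma terminates the value.
            break
        out.append(s[i])
        i += 1
    return "".join(out), s[i:]
```

A kernel command line such as "console=ttyS0,115200" would be cut at
"console=ttyS0" by this kind of parsing, which is why a verbatim copy
suits -append, while module strings that use ",," escapes still need
the unescaping step.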
Signed-off-by: Vlad Lungu <vlad.lungu@windriver.com>
Message-Id: <1481805124-16242-1-git-send-email-vlad.lungu@windriver.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The remote protocol can't handle flipping back and forth
between 32-bit and 64-bit regs. To compensate, behave "as if"
on a 64-bit CPU even when in 32-bit mode.
Signed-off-by: Doug Evans <dje@google.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <001a113dca8274572005406e03c3@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 87f68d3182 (block: drop aio
functions that operate on the main AioContext) dropped most references
to the qemu_aio_wait function but left these behind. Clean them up.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-3-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 49cf57281b (vl: delay thread initialization after daemonization)
makes the global mutex be taken after daemonization, instead of before
daemonization by qemu_init_main_loop().
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-2-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is the timer that expires, not the clock.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-1-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids taking the active_timers_lock or resetting/setting the
timers_done_ev if there are no active timers. This removes a small
(2-3%) source of overhead for dataplane. The list is then checked
again inside the lock, or a NULL pointer could be dereferenced.
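The check-then-recheck pattern described above can be sketched in
Python as follows. The class and its lock_acquisitions counter are
hypothetical and only model the idea. The field names
active_timers and active_timers_lock are borrowed from the text, and
the timers_done_ev handling is left out.

```python
import threading


class _Timer:
    def __init__(self, expire_time, callback):
        self.expire_time = expire_time
        self.callback = callback
        self.next = None


class TimerListSketch:
    def __init__(self):
        self.active_timers = None  # head of a sorted linked list, or None
        self.active_timers_lock = threading.Lock()
        self.lock_acquisitions = 0  # instrumentation for the example only

    def add(self, expire_time, callback):
        node = _Timer(expire_time, callback)
        with self.active_timers_lock:
            prev, cur = None, self.active_timers
            while cur is not None and cur.expire_time <= expire_time:
                prev, cur = cur, cur.next
            node.next = cur
            if prev is None:
                self.active_timers = node
            else:
                prev.next = node

    def run_timers(self, now):
        # Fast path: no active timers, so skip the lock entirely.
        if self.active_timers is None:
            return 0
        ran = 0
        while True:
            with self.active_timers_lock:
                self.lock_acquisitions += 1
                # Recheck under the lock: the list may have become empty
                # since the unlocked check above.
                timer = self.active_timers
                if timer is None or timer.expire_time > now:
                    break
                self.active_timers = timer.next
            timer.callback()  # run the callback outside the lock
            ran += 1
        return ran
```

Calling run_timers() on an empty list returns without touching the
lock, which is the overhead the change avoids.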
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These will be used more as soon as the acquire/release is pushed down to
the ioeventfd handlers.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Libtool support was removed in commit e999ee4434, but a few leftovers
remain.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161108070513.30274-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rule chaining is not really a particularly expensive task, since
GNU Make caches the directory listing. However it is easy to
avoid it for most files and for phony targets (one was missing).
After this patch, only "Makefile", "scripts/hxtool" and
"scripts/create_config" attempt to use chained rules.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unnesting variables spends a lot of time parsing and executing foreach
and if functions. Because actually very few variables have to be
saved and restored, a good strategy is to remember what has to be done
in load-vars, and only iterate the right variables in load-vars.
For save-vars, unroll the foreach loop to provide another small
improvement.
This speeds up a "noop" build from around 15.5 seconds on my laptop
to 11.7 (25% roughly).
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the Intel 6300ESB watchdog is hot-unplugged, the timer allocated
in realize is not freed, which leaks memory. This patch avoids
the leak by adding an exit function.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <583cde9c.3223ed0a.7f0c2.886e@mx.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Device models often have to perform multiple accesses to a single
memory region that is known in advance, but would like to use "DMA-style"
functions instead of address_space_map/unmap. This can happen,
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This extracts the common part of address_space_map and
address_space_cache_init into a new function.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do them right before the next patch generalizes them into a multi-included
file.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds support for nettle-backed HMAC algorithms.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds support for glib-backed HMAC algorithms.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds support for HMAC algorithms based on libgcrypt.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch introduces the HMAC algorithms framework.
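For readers unfamiliar with what such a framework computes, the
following Python sketch builds HMAC by hand from a plain hash function,
following the RFC 2104 construction. It relies only on the Python
standard library and is unrelated to the QEMU crypto API. The function
name hmac_manual is hypothetical.

```python
import hashlib
import hmac


def hmac_manual(key, msg, hash_name="sha256"):
    """HMAC(key, msg) = H((key ^ opad) || H((key ^ ipad) || msg))."""
    block_size = hashlib.new(hash_name).block_size
    if len(key) > block_size:
        # Keys longer than one hash block are hashed first.
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_size, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.new(hash_name, ipad + msg).digest()
    return hashlib.new(hash_name, opad + inner).digest()


def hmac_reference(key, msg, hash_name="sha256"):
    """The same value computed by the standard library."""
    return hmac.new(key, msg, hash_name).digest()
```

Both functions return the same digest for the same key and message,
which is the property a backend (nettle, glib or libgcrypt) has to
provide behind a common interface.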
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This item will be used to support libgcrypt-backed HMAC algorithms.
Support for HMAC was added in Libgcrypt 1.6.0, but we cannot
use pkg-config to get libgcrypt's version. However, we can run a test
in configure to find out whether the current libgcrypt supports HMAC.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Libgcrypt and nettle support 3des-ede, so this patch adds 3des-ede
support when using libgcrypt or nettle.
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
On the error path, ctx may be leaked. Assign ctx earlier, and call
qcrypto_cipher_free() on error.
Spotted thanks to ASAN.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The blocksize option is defined in RFC 1783 and RFC 2348.
We now support block sizes between 1 and 1428 bytes, instead of 512 only.
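A minimal Python sketch of the negotiation follows. It parses the
NUL-separated fields of a read request and picks a block size within
the range stated above. The function names are hypothetical, and the
handling of out-of-range requests (clamping large values down to 1428,
falling back to 512 for missing or invalid ones) is an assumption made
for illustration, not a description of the slirp code.

```python
TFTP_RRQ = 1  # opcode of a read request


def parse_rrq(packet):
    """Return (filename, mode, options) from a TFTP read request."""
    opcode = int.from_bytes(packet[:2], "big")
    if opcode != TFTP_RRQ:
        raise ValueError("not a read request")
    fields = packet[2:].split(b"\x00")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty field after the final NUL
    filename, mode, *rest = fields
    # Options come as name/value pairs; option names are case-insensitive.
    options = {
        rest[i].decode("ascii").lower(): rest[i + 1].decode("ascii")
        for i in range(0, len(rest) - 1, 2)
    }
    return filename.decode("ascii"), mode.decode("ascii"), options


def negotiate_blksize(options, max_blksize=1428, default=512):
    """Pick the block size to use for the transfer (assumed policy)."""
    raw = options.get("blksize")
    if raw is None:
        return default
    try:
        requested = int(raw)
    except ValueError:
        return default
    if requested < 1:
        return default
    return min(requested, max_blksize)
```

A client asking for 1400 bytes gets 1400, while a request for 8192
bytes is reduced to 1428 under the assumed policy.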
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
We've currently got 18 architectures in QEMU, and thus 18 target-xxx
folders in the root folder of the QEMU source tree. More architectures
(e.g. RISC-V, AVR) are likely to be included soon, too, so the main
folder of the QEMU sources slowly gets quite overcrowded with the
target-xxx folders.
To disburden the main folder a little bit, let's move the target-xxx
folders into a dedicated target/ folder, so that target-xxx/ simply
becomes target/xxx/ instead.
Acked-by: Laurent Vivier <laurent@vivier.eu> [m68k part]
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [tricore part]
Acked-by: Michael Walle <michael@walle.cc> [lm32 part]
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x part]
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [s390x part]
Acked-by: Eduardo Habkost <ehabkost@redhat.com> [i386 part]
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com> [sparc part]
Acked-by: Richard Henderson <rth@twiddle.net> [alpha part]
Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa part]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ppc part]
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> [cris&microblaze part]
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [unicore32 part]
Signed-off-by: Thomas Huth <thuth@redhat.com>
This patch makes virtio-gpu track host memory allocations for resources
and applies a limit (configurable, 256M by default). When exceeding the
limit virtio-gpu throws VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY errors (like
it already does today when pixman image allocations fail).
This patch covers 2d mode only. For 3d mode we have to figure how we
are going to handle this best. qemu doesn't track resources in case
virglrenderer is used, so I guess we should extend virglrenderer to
allow setting a limit, then let qemu set the limit and catch
virgl_renderer_resource_create failures.
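The accounting for the 2d case can be sketched in Python as below. The
class, its method names and the 4-bytes-per-pixel default are
hypothetical. Only the 256M default limit and the
VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY error come from the description
above, and the exact comparison used at the limit is an assumption.

```python
class HostMemTrackerSketch:
    def __init__(self, max_hostmem=256 * 1024 * 1024):
        self.max_hostmem = max_hostmem  # configurable limit in bytes
        self.hostmem = 0                # bytes currently accounted
        self.resources = {}             # resource id -> size in bytes

    def create_2d(self, resource_id, width, height, bytes_per_pixel=4):
        size = width * height * bytes_per_pixel
        if self.hostmem + size > self.max_hostmem:
            # Refuse the allocation instead of growing without bound.
            return "VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY"
        self.resources[resource_id] = size
        self.hostmem += size
        return "VIRTIO_GPU_RESP_OK_NODATA"

    def unref(self, resource_id):
        # Give the memory back to the budget when a resource goes away.
        self.hostmem -= self.resources.pop(resource_id, 0)
```

With a 16M limit, a 1024x1024 resource (4M) is accepted and a later
2048x2048 resource (16M) is refused until the first one is released.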
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1480423356-22255-1-git-send-email-kraxel@redhat.com
While processing the 'VIRTIO_GPU_CMD_GET_CAPSET' command, the virtio
GPU device retrieves the maximum capabilities size to fill in the
response object. It continues to fill in capabilities even if the
retrieved 'max_size' is zero (0), resulting in an OOB access.
Add a check to avoid it.
Reported-by: Zhenhao Hong <zhenhaohong@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20161214070156.23368-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Because guest mask notifier cannot be used in vhost-user mode, a boolean
flag "use_guest_notifier_mask" was added in commit 5669655aaf to disable
the use of guest mask notifier under virtio-pci. However, this flag wasn't
checked in other virtio devices, such as virtio-mmio. In our tests, this
caused an assertion error under "vhost-user + virtio-mmio". This patch
addresses the problem by adding a check before guest_notifier_mask is
called.
Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A PCI Express downstream slot has a single PCI slot
behind it; using PCI_DEVFN(PCI_SLOT(devfn), 0)
does not give you function 0 in cases such as ARI,
as well as in some error cases.
This is exactly what we are hitting:
$ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg
-monitor stdio
(qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
(qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
Segmentation fault (core dumped)
The fix is to use the pci_get_function_0 API.
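The arithmetic behind the failing case can be reproduced with Python
versions of the conventional devfn macros (5 bits of slot, 3 bits of
function). The example assumes that addr=08 is parsed as slot 8,
function 0. The helper function_0_devfn_sketch is hypothetical and only
illustrates the idea of treating devfn 0 as function 0 behind a PCI
Express port. It is not the body of pci_get_function_0.

```python
def PCI_DEVFN(slot, func):
    return ((slot & 0x1F) << 3) | (func & 0x07)


def PCI_SLOT(devfn):
    return (devfn >> 3) & 0x1F


def PCI_FUNC(devfn):
    return devfn & 0x07


def function_0_same_slot(devfn):
    """The old computation: function 0 of whatever slot devfn encodes."""
    return PCI_DEVFN(PCI_SLOT(devfn), 0)


def function_0_devfn_sketch(devfn, behind_pcie_port):
    """Hypothetical helper: behind a PCIe port only devfn 0 is function 0."""
    if behind_pcie_port:
        return 0
    return function_0_same_slot(devfn)
```

For the second device_add above, the old computation yields devfn 0x40
rather than 0, so the lookup does not land on the device that was
plugged at addr=00.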
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Tested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
IOMMU MMIO registers are divided into two groups by their offsets.
Registers at low offsets (<0x2000) are grouped into the 'amdvi_mmio_low'
table and registers at higher offsets (>=0x2000) are grouped into the
'amdvi_mmio_high' table. The number of registers in each table is given
by the macros 'AMDVI_MMIO_REGS_LOW' and 'AMDVI_MMIO_REGS_HIGH' respectively.
The values of these two macros were swapped, resulting in an OOB
access when reading the 'amdvi_mmio_high' table. Correct these two
macros. Also read from the 'amdvi_mmio_low' table for lower addresses.
Reported-by: Azureyang <azureyang@tencent.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the libvhost-user library.
This ended up being a rather large patch that cannot easily be split,
because of the massive code movement and API changes.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a library to help implement a vhost-user backend (or slave).
Dealing with vhost-user as an application developer isn't so easy: you
have all the troubles of any protocol: validation, unix ancillary data,
shared memory, eventfd, logging, and on top of that you need to deal
with virtio queues, efficiently if possible.
The QEMU tests have a nice vhost-user testing application, vhost-user-bridge,
which implements most of vhost-user, and virtio.c, which implements
virtqueue manipulation. Based on these two, I tried to make a simple
library, reusable for tests or development of new vhost-user scenarios.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[Felipe: set used_idx copy on SET_VRING_ADDR and update shadow avail idx
on SET_VRING_BASE]
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The call fd is not watched.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>