Instead of splitting at an unaligned address, we can simply split at
4TB.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
s390 was trying to solve limited KVM memslot size issue by abusing
memory_region_allocate_system_memory(), which breaks API contract
where the function might be called only once.
Beside an invalid use of API, the approach also introduced migration
issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are transferred in
migration stream as separate RAMBlocks.
After discussion [1], it was agreed to break migration from older
QEMU for guest with RAM >8Tb (as it was relatively new (since 2.12)
and considered to be not actually used downstream).
Migration should keep working for guests with less than 8TB and for
more than 8TB with QEMU 4.2 and newer binary.
In case user tries to migrate more than 8TB guest, between incompatible
QEMU versions, migration should fail gracefully due to non-exiting
RAMBlock ID or RAMBlock size mismatch.
Taking in account above and that now KVM code is able to split too
big MemorySection into several memslots, partially revert commit
(bb223055b s390-ccw-virtio: allow for systems larger that 7.999TB)
and use kvm_set_max_memslot_size() to set KVMSlot size to
KVM_SLOT_MAX_BYTES.
1) [PATCH RFC v2 4/4] s390: do not call memory_region_allocate_system_memory() multiple times
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20190924144751.24149-5-imammedo@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Each operand can have a maximum length of 16. Make sure to prepare all
reads/writes before writing.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Access at most single pages and document why. Using the access helpers
might over-indicate watchpoints within the same page, I guess we can
live with that.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can process a maximum of 256 bytes, crossing two pages.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can process a maximum of 256 bytes, crossing two pages.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can process a maximum of 256 bytes, crossing two pages. Calculate the
accessed range upfront - src is accessed right-to-left.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can process a maximum of 256 bytes, crossing two pages.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can process a maximum of 256 bytes, crossing two pages. While at it,
increment the length once.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can process a maximum of 256 bytes, crossing two pages.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
The last remaining bit is padding with two bytes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
The last remaining bit for MVC is handling destructive overlaps in a
fault-safe way.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
As we are moving between address spaces, we can use access_memmove()
without checking for destructive overlaps (especially of real storage
locations):
"Each storage operand is processed left to right. The
storage-operand-consistency rules are the same as
for MOVE (MVC), except that when the operands
overlap in real storage, the use of the common real-
storage locations is not necessarily recognized."
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Replace fast_memmove() variants by access_memmove() variants, that
first try to probe access to all affected pages (maximum is two pages).
Introduce access_get_byte()/access_set_byte(). We might be able to speed
up memmove in special cases even further (do single-byte access, use
memmove() for remaining bytes in page), however, we'll skip that for now.
In MVCOS, simply always call access_memmove_as() and drop the TODO
about LAP. LAP is already handled in the MMU.
Get rid of adj_len_to_page(), which is now unused.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Replace fast_memset() by access_memset(), that first tries to probe
access to all affected pages (maximum is two). We'll use the same
mechanism for other types of accesses soon.
Only in very rare cases (especially TLB_NOTDIRTY), we'll have to
fallback to ld/st helpers. Try to speed up that case as suggested by
Richard.
We'll rework most involved handlers soon to do all accesses via new
fault-safe helpers, especially MVC.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Although we basically ignore the index all the time for CONFIG_USER_ONLY,
let's simply skip all the checks and always return MMU_USER_IDX in
cpu_mmu_index() and get_mem_index().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
24 and 31-bit address space handling is wrong when it comes to storing
back the addresses to the register.
While at it, read gprs 0 implicitly.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Bit position 32-55 of general register 0 must be zero.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
... and don't perform any move in case the length is zero.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Triggered by a review comment from Richard, also MVCOS has a 32-bit
length in 24/31-bit addressing mode. Add a new helper.
Rename wrap_length() to wrap_length31().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's stay within single pages.
... and indicate cc=3 in case there is work remaining. Keep unicode
padding simple.
While reworking, properly wrap the addresses.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We have to mask of any unused bits. While at it, document what exactly is
missing.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Perform the checks documented in the PoP.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's use the new helper, that also detects destructive overlaps when
wrapping.
We'll make the remaining code (e.g., fast_memmove()) aware of wrapping
later.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's increment the length once.
While at it, cleanup the comment. The memset() example is given as a
programming note in the PoP, so drop the description.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Process max 4k bytes at a time, writing back registers between the
accesses. The instruction is interruptible.
"For operands longer than 2K bytes, access exceptions are not
recognized for locations more than 2K bytes beyond the current location
being processed."
Note that on z/Architecture, 2k vs. 4k access cannot get differentiated as
long as pages are not crossed. This seems to be a leftover from ESA/390.
Simply stay within single pages.
MVCL handling is quite different than MVCLE/MVCLU handling, so split up
the handlers.
Defer interrupt handling, as that will require more thought, add a TODO
for that.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We'll have to zero-out unused bit positions, so make sure to write the
addresses back.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We have to zero out unused bits in 24 and 31-bit addressing mode.
Provide a new helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We use the marker "-1" for "no exception". s390_cpu_do_interrupt() might
get confused by that.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Since QEMU v2.10, the KVM acceleration does not work on older kernels
anymore since the code accidentally requires the KVM_CAP_DEVICE_CTRL
capability now - it should have been optional instead.
Instead of fixing the bug, we asked in the ChangeLog of QEMU 2.11 - 3.0
that people should speak up if they still need support of QEMU running
with KVM on older kernels, but seems like nobody really complained.
Thus let's make this official now and turn it into a proper error
message, telling the users to use at least kernel 3.15 now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190913091443.27565-1-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
... and also call it for CONFIG_USER_ONLY. This function probably will
also need some refactoring in regards to probing, however, we'll have to
come back to that later, once cleaning up the other mem helpers.
The alignment check always makes sure that the write access falls into a
single page.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190826075112.25637-8-david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Factor it out into common code. Similar to the !CONFIG_USER_ONLY variant,
let's not allow to cross page boundaries.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190826075112.25637-4-david@redhat.com>
[rth: Move cpu & cc variables inside if block.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Hm... how did that "-" slip in (-TAGRET_PAGE_SIZE would be correct). This
currently makes us exceed one page in a single probe_write() call,
essentially leaving some memory unchecked.
Fixes: c5a7392cfb ("s390x/tcg: Provide probe_write_access helper")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190826075112.25637-3-david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If I'm not completely wrong, we are dealing with guest addresses here
and not with host addresses. Use the right check.
Fixes: c5a7392cfb ("s390x/tcg: Provide probe_write_access helper")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190826075112.25637-2-david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.
Target dependant attributes are conditionalized upon NEED_CPU_H.
Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Factor it out, add a comment how it all works, and also use it in the
REAL MMU.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Any access sets the reference bit. In case we have a read-fault, we
should not allow writes to the TLB entry if the change bit was not
already set.
This is a preparation for proper storage-key reference/change bit handling
in TCG and a fix for KVM whereby read accesses would set the change
bit (old KVM versions without the ioctl to carry out the translation).
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Whenever we modify a storage key, we should flush the TLBs of all CPUs,
so the MMU fault handling code can properly consider the changed storage
key (to e.g., properly set the reference and change bit on the next
accesses).
These functions are barely used in modern Linux guests, so the performance
implications are neglectable for now.
This is a preparation for better reference and change bit handling for
TCG, which will require more MMU changes.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-5-david@redhat.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instructions are always fetched from primary address space, except when
in home address mode. Perform the selection directly in cpu_mmu_index().
get_mem_index() is only used to perform data access, instructions are
fetched via cpu_lduw_code(), which translates to cpu_mmu_index(env, true).
We don't care about restricting the access permissions of the TLB
entries anymore, as we no longer enter PRIMARY entries into the
SECONDARY MMU. Cleanup related code a bit.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190816084708.602-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's select the ASC before calling the function. This is a prepararion
to remove the ASC magic depending on the access mode from mmu_translate.
There is currently no way to distinguish if we have code or data access.
For now, we were using code access, because especially when debugging with
the gdbstub, we want to read and disassemble what we single-step.
Note: KVM guest can now no longer be crashed using qmp/hmp/gdbstub if they
happen to be in AR mode.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We want to trace the actual return value, not "0".
Fixes: 0f5f669147 ("s390x: Enable new s390-storage-keys device")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Wrong order of operands. The constant always comes last. Makes QEMU crash
reliably on specific git fetch invocations.
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190814151242.27199-1-david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Fixes: 5c4b0ab460 ("s390x/tcg: Implement VECTOR ELEMENT ROTATE AND INSERT UNDER MASK")
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h
in comments replaced]
Generally the cpu and non-FP helper files just want to manipulate the
softfloat flags. For this they can just use the -helpers.h include
which brings in a minimal number of inline helpers.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator. Evidence:
* It's included widely: in my "build everything" tree, changing
sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
objects (not counting tests and objects that don't depend on
qemu/osdep.h, down from 5400 due to the previous two commits).
* It pulls in more than a dozen additional headers.
Split stuff related to run state management into its own header
sysemu/runstate.h.
Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200. Touching new sysemu/runstate.h recompiles some 500 objects.
Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h). It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.
Include qemu/main-loop.h only where it's needed. Touching it now
recompiles only some 1700 objects. For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the
others, they shrink only slightly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
The previous commits have left only the declaration of hw_error() in
hw/hw.h. This permits dropping most of its inclusions. Touching it
now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>