The FP-to-integer conversion instructions need to set CC 3 whenever
a "special case" occurs; this is the case whenever the instruction
also signals the IEEE invalid exception. (See e.g. figure 19-18
in the Principles of Operation.)
However, qemu currently will set CC 3 only in the case where the
input was a NaN. This is indeed one of the special cases, but
there are others, most notably the case where the input is out
of range of the target data type.
This patch fixes the problem by switching these instructions to
the "static" CC method and computing the correct result directly
in the helper. (It cannot be re-computed later as the information
about the invalid exception is no longer available.)
This fixes a bug observed when running the wasmtime test suite
under the s390x-linux-user target.
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210630105058.GA29130@oc3748833570.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
For IEEE functions, we can reuse the softfloat implementations. For the
other functions, implement it generically for 32bit/64bit/128bit -
carefully taking care of all weird special cases according to the tables
defined in the PoP.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-24-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
128 bit -> 64 bit, there is only a single element to process.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-19-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
64 bit -> 128 bit, there is only a single final element.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-18-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
In addition to 32/128bit variants, we also have to support the
"Signal-on-QNaN (SQ)" bit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-16-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
In case of 128bit, we always have a single element. Add new helpers for
reading/writing 32/128 bit floats.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Pass the m5 field via simd_data() and don't provide specialized handlers
for single-element variants.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's rework our macros and simplify. We still need helper functions in
most cases due to the different parameters types.
Next, we'll only have 32/128bit variants for vfi and vfsq, so special
case the others.
Note that for vfsq, the XxC and erm passed in the simd_data() will never be
set, resulting in the same behavior.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's simplify, reworking our handler generation, passing the whole "m5"
register content and not providing specialized handlers for "se", and
reading/writing proper float64 values using new helpers.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The PoP states:
When EDAT-1 does not apply, and a program interruption due to a
page-translation exception is recognized by the MOVE PAGE
instruction, the contents of the R1 field of the instruction are
stored in bit positions 0-3 of location 162, and the contents of
the R2 field are stored in bit positions 4-7.
If [...] an ASCE-type, region-first-translation,
region-second-translation, region-third-translation, or
segment-translation exception was recognized, the contents of
location 162 are unpredictable.
So we have to write r1/r2 into the lowcore on page-translation
exceptions. Simply handle all exceptions inside our mvpg helper now.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210315085449.34676-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Recent upstream Linux uses the MONITOR CALL instruction for things like
BUG_ON() and WARN_ON(). We currently inject an operation exception when
we hit a MONITOR CALL instruction - which is wrong, as the instruction
is not glued to specific CPU features.
Doing a simple WARN_ON_ONCE() currently results in a panic:
[ 18.162801] illegal operation: 0001 ilc:2 [#1] SMP
[ 18.162889] Modules linked in:
[...]
[ 18.165476] Kernel panic - not syncing: Fatal exception: panic_on_oops
With a proper implementation, we now get:
[ 18.242754] ------------[ cut here ]------------
[ 18.242855] WARNING: CPU: 7 PID: 1 at init/main.c:1534 [...]
[ 18.242919] Modules linked in:
[...]
[ 18.246262] ---[ end trace a420477d71dc97b4 ]---
[ 18.259014] Freeing unused kernel memory: 4220K
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200918085122.26132-1-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Merge VERLL and VERLLV into op_vesv and op_ves, alongside
all of the other vector shift operations.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These are trivially done by performing a memory operation
with the correct mmu_idx. The only tricky part is using
get_address directly in order to get the address wrapped;
we cannot use la2 because of the format.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20191211203614.15611-3-richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Split the PER handling for store-to-real-address into its
own helper function, conditionally called when PER is
enabled, just as we do for per_branch and per_ifetch.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20191211203614.15611-2-richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
24 and 31-bit address space handling is wrong when it comes to storing
back the addresses to the register.
While at it, read gprs 0 implicitly.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Simulate XxC=0 and ERM=0 (current mode), so we can use the existing
helper function.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse some of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Take care of reading/indicating the 32-bit elements.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse most of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse most of the infrastructure added for VECTOR FP ADD.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
1. We'll reuse op_vcdg() for similar instructions later, prepare for
that.
2. We'll reuse vop64_2() later for other instructions.
We have to mangle the erm (effective rounding mode) and the m4 into
the simd_data(), and properly unmangle them again.
Make sure to restore the erm before triggering an exception.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Provide for all three instructions all four combinations of cc bit and
s bit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
As far as I can see, there is only a tiny difference.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
1. We'll reuse op_vfa() for similar instructions later, prepare for
that.
2. We'll reuse vop64_3() for other instructions later.
3. Take care of modifying the vector register only if no trap happened.
- on traps, flags are not updated and no elements are modified
- traps don't modify the fpc flags
- without traps, all exceptions of all elements are merged
4. We'll reuse check_ieee_exc() later when we need the XxC flag.
We have to check for exceptions after processing each element.
Provide separate handlers for single/all element processing. We'll do
the same for all applicable FP instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Unfortunately, there is no easy way to avoid looping over all elements
in v2. Provide specialized variants for !cc,!rt/!cc,rt/cc,!rt/cc,rt and
all element types. Especially for different values of rt, the compiler
might be able to optimize the code a lot.
Add s390_vec_write_element().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR FIND ELEMENT EQUAL. Core logic courtesy of Richard H.
Add s390_vec_read_element() that can deal with element sizes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Complicated stuff. Provide two different helpers for CC an !CC handling.
We might want to add more helpers later.
zero_search() and match_index() are courtesy of Richard H.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's return the cc value directly via cpu_env. Unfortunately there
isn't a simple way to calculate the value lazily - one would have to
calculate and store e.g. the population count of the mask and the
result so it can be evaluated in a cc helper.
But as VTM only sets the cc, we can assume the value will be needed soon
either way.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's keep it simple for now and handle 8/16 bit elements via helpers.
Especially for 8/16, we could come up with some bit tricks.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SHIFT RIGHT ARITHMETICAL.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SHIFT LEFT ARITHMETIC. Add s390_vec_sar() similar to
s390_vec_shr().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>