Commit Graph

110 Commits

Author SHA1 Message Date
David Hildenbrand e384332cb5 s390x/tcg: Implement 32/128 bit for VECTOR FP COMPARE *
In addition to 32/128bit variants, we also have to support the
"Signal-on-QNaN (SQ)" bit.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-16-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand acb269a4cd s390x/tcg: Implement 32/128 bit for VECTOR (LOAD FP INTEGER|FP SQUARE ROOT)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 0987961da9 s390x/tcg: Implement 32/128 bit for VECTOR FP (ADD|DIVIDE|MULTIPLY|SUBTRACT)
In case of 128bit, we always have a single element. Add new helpers for
reading/writing 32/128 bit floats.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 2a785dfb50 s390x/tcg: Implement VECTOR BIT PERMUTE
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-12-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 977e43d977 s390x/tcg: Simplify vflr64() handling
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 860b707bbb s390x/tcg: Simplify vfll32() handling
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 34142ffdee s390x/tcg: Simplify vfma64() handling
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 622ebe64ad s390x/tcg: Simplify vftci64() handling
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 64deb65afe s390x/tcg: Simplify vfc64() handling
Pass the m5 field via simd_data() and don't provide specialized handlers
for single-element variants.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 21bd6ea2b3 s390x/tcg: Simplify vop64_2() handling
Let's rework our macros and simplify. We still need helper functions in
most cases due to the different parameters types.

Next, we'll only have 32/128bit variants for vfi and vfsq, so special
case the others.

Note that for vfsq, the XxC and erm passed in the simd_data() will never be
set, resulting in the same behavior.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 863b9507a6 s390x/tcg: Simplify vop64_3() handling
Let's simplify, reworking our handler generation, passing the whole "m5"
register content and not providing specialized handlers for "se", and
reading/writing proper float64 values using new helpers.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210608092337.12221-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-06-21 08:48:21 +02:00
David Hildenbrand 1a3c443c43 target/s390x: Store r1/r2 for page-translation exceptions during MVPG
The PoP states:

    When EDAT-1 does not apply, and a program interruption due to a
    page-translation exception is recognized by the MOVE PAGE
    instruction, the contents of the R1 field of the instruction are
    stored in bit positions 0-3 of location 162, and the contents of
    the R2 field are stored in bit positions 4-7.

    If [...] an ASCE-type, region-first-translation,
    region-second-translation, region-third-translation, or
    segment-translation exception was recognized, the contents of
    location 162 are unpredictable.

So we have to write r1/r2 into the lowcore on page-translation
exceptions. Simply handle all exceptions inside our mvpg helper now.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210315085449.34676-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-03-15 11:03:20 +01:00
David Hildenbrand 20d143e2ca s390x/tcg: Implement MONITOR CALL
Recent upstream Linux uses the MONITOR CALL instruction for things like
BUG_ON() and WARN_ON(). We currently inject an operation exception when
we hit a MONITOR CALL instruction - which is wrong, as the instruction
is not glued to specific CPU features.

Doing a simple WARN_ON_ONCE() currently results in a panic:
  [   18.162801] illegal operation: 0001 ilc:2 [#1] SMP
  [   18.162889] Modules linked in:
  [...]
  [   18.165476] Kernel panic - not syncing: Fatal exception: panic_on_oops

With a proper implementation, we now get:
  [   18.242754] ------------[ cut here ]------------
  [   18.242855] WARNING: CPU: 7 PID: 1 at init/main.c:1534 [...]
  [   18.242919] Modules linked in:
  [...]
  [   18.246262] ---[ end trace a420477d71dc97b4 ]---
  [   18.259014] Freeing unused kernel memory: 4220K

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200918085122.26132-1-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-10-02 13:52:49 +02:00
Richard Henderson cea94ba36d target/s390x: Use tcg_gen_gvec_rotl{i,s,v}
Merge VERLL and VERLLV into op_vesv and op_ves, alongside
all of the other vector shift operations.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2020-06-02 08:42:37 -07:00
Richard Henderson 5e34df7cc9 target/s390x: Implement LOAD/STORE TO REAL ADDRESS inline
These are trivially done by performing a memory operation
with the correct mmu_idx.  The only tricky part is using
get_address directly in order to get the address wrapped;
we cannot use la2 because of the format.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20191211203614.15611-3-richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-12-18 12:57:29 +01:00
Richard Henderson ebed683c4e target/s390x: Split out helper_per_store_real
Split the PER handling for store-to-real-address into its
own helper function, conditionally called when PER is
enabled, just as we do for per_branch and per_ifetch.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20191211203614.15611-2-richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-12-18 12:57:29 +01:00
David Hildenbrand 2bb525e20d s390x/tcg: MVST: Fix storing back the addresses to registers
24 and 31-bit address space handling is wrong when it comes to storing
back the addresses to the register.

While at it, read gprs 0 implicitly.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-09-23 09:28:29 +02:00
David Hildenbrand 83b955f9a8 s390x/tcg: Implement VECTOR FP TEST DATA CLASS IMMEDIATE
We can reuse float64_dcmask().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:26 +02:00
David Hildenbrand 658a395f6c s390x/tcg: Implement VECTOR FP SUBTRACT
Similar to VECTOR FP ADD.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:26 +02:00
David Hildenbrand 5938f20cb8 s390x/tcg: Implement VECTOR FP SQUARE ROOT
Simulate XxC=0 and ERM=0 (current mode), so we can use the existing
helper function.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:26 +02:00
David Hildenbrand c64c598402 s390x/tcg: Implement VECTOR FP MULTIPLY AND (ADD|SUBTRACT)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:26 +02:00
David Hildenbrand 8d47d4d212 s390x/tcg: Implement VECTOR FP MULTIPLY
Very similar to VECTOR FP DIVIDE.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:26 +02:00
David Hildenbrand 4500ede452 s390x/tcg: Implement VECTOR LOAD ROUNDED
We can reuse some of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 1a76e59da3 s390x/tcg: Implement VECTOR LOAD LENGTHENED
Take care of reading/indicating the 32-bit elements.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 60d0ab29a1 s390x/tcg: Implement VECTOR LOAD FP INTEGER
We can reuse most of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 817a1cec89 s390x/tcg: Implement VECTOR FP DIVIDE
We can reuse most of the infrastructure added for VECTOR FP ADD.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 09c04e4b88 s390x/tcg: Implement VECTOR FP CONVERT TO LOGICAL 64-BIT
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 35b3bb1c55 s390x/tcg: Implement VECTOR FP CONVERT TO FIXED 64-BIT
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 9b8d1a387d s390x/tcg: Implement VECTOR FP CONVERT FROM LOGICAL 64-BIT
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand bb03fd841c s390x/tcg: Implement VECTOR FP CONVERT FROM FIXED 64-BIT
1. We'll reuse op_vcdg() for similar instructions later, prepare for
   that.
2. We'll reuse vop64_2() later for other instructions.

We have to mangle the erm (effective rounding mode) and the m4 into
the simd_data(), and properly unmangle them again.

Make sure to restore the erm before triggering an exception.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 2c806ab443 s390x/tcg: Implement VECTOR FP COMPARE (EQUAL|HIGH|HIGH OR EQUAL)
Provide for all three instructions all four combinations of cc bit and
s bit.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 5b89f0fba2 s390x/tcg: Implement VECTOR FP COMPARE (AND SIGNAL) SCALAR
As far as I can see, there is only a tiny difference.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 3a0eae8546 s390x/tcg: Implement VECTOR FP ADD
1. We'll reuse op_vfa() for similar instructions later, prepare for
   that.
2. We'll reuse vop64_3() for other instructions later.
3. Take care of modifying the vector register only if no trap happened.
 - on traps, flags are not updated and no elements are modified
 - traps don't modify the fpc flags
 - without traps, all exceptions of all elements are merged
4. We'll reuse check_ieee_exc() later when we need the XxC flag.

We have to check for exceptions after processing each element.
Provide separate handlers for single/all element processing. We'll do
the same for all applicable FP instructions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 13b0228f77 s390x/tcg: Implement VECTOR STRING RANGE COMPARE
Unfortunately, there is no easy way to avoid looping over all elements
in v2. Provide specialized variants for !cc,!rt/!cc,rt/cc,!rt/cc,rt and
all element types. Especially for different values of rt, the compiler
might be able to optimize the code a lot.

Add s390_vec_write_element().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand be6324c6b7 s390x/tcg: Implement VECTOR ISOLATE STRING
Logic mostly courtesy of Richard H.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 074e99b3b5 s390x/tcg: Implement VECTOR FIND ELEMENT NOT EQUAL
Similar to VECTOR FIND ELEMENT EQUAL. Core logic courtesy of Richard H.

Add s390_vec_read_element() that can deal with element sizes.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 8c0e1e58ce s390x/tcg: Implement VECTOR FIND ELEMENT EQUAL
Core logic courtesy of Richard H.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand 1fd286385c s390x/tcg: Implement VECTOR FIND ANY ELEMENT EQUAL
Complicated stuff. Provide two different helpers for CC an !CC handling.
We might want to add more helpers later.

zero_search() and match_index() are courtesy of Richard H.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-06-07 14:53:25 +02:00
David Hildenbrand db156ebfae s390x/tcg: Implement VECTOR TEST UNDER MASK
Let's return the cc value directly via cpu_env. Unfortunately there
isn't a simple way to calculate the value lazily - one would have to
calculate and store e.g. the population count of the mask and the
result so it can be evaluated in a cc helper.

But as VTM only sets the cc, we can assume the value will be needed soon
either way.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 1ee2d7ba72 s390x/tcg: Implement VECTOR SUBTRACT COMPUTE BORROW INDICATION
Let's keep it simple for now and handle 8/16 bit elements via helpers.
Especially for 8/16, we could come up with some bit tricks.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 8112274f86 s390x/tcg: Implement VECTOR SHIFT RIGHT LOGICAL *
Similar to VECTOR SHIFT RIGHT ARITHMETICAL.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 5f724887e3 s390x/tcg: Implement VECTOR SHIFT RIGHT ARITHMETIC
Similar to VECTOR SHIFT LEFT ARITHMETIC. Add s390_vec_sar() similar to
s390_vec_shr().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand dea33fc31b s390x/tcg: Implement VECTOR SHIFT LEFT (BY BYTE)
We can reuse the existing 128-bit shift utility function.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 5c4b0ab460 s390x/tcg: Implement VECTOR ELEMENT ROTATE AND INSERT UNDER MASK
Use the new vector expansion for GVecGen3i.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 55236da222 s390x/tcg: Implement VECTOR ELEMENT ROTATE LEFT LOGICAL
Take care of properly taking the modulo of the count. We might later
want to come back and create a variant of VERLL where the base register
is 0, resulting in an immediate.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand c3838aaae0 s390x/tcg: Implement VECTOR POPULATION COUNT
Similar to VECTOR COUNT TRAILING ZEROES.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 2bf3ee38f1 s390x/tcg: Implement VECTOR MULTIPLY *
Yet another set of variants. Implement it similar to VECTOR MULTIPLY AND
ADD *. At least for one variant we have a gvec helper we can reuse.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 1b430aec41 s390x/tcg: Implement VECTOR MULTIPLY AND ADD *
Quite some variants to handle. At least handle some 32-bit element
variants via gvec expansion (we could also handle 16/32-bit variants
for ODD and EVEN easily via gvec expansion, but let's keep it simple
for now).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 697a45d695 s390x/tcg: Implement VECTOR GALOIS FIELD MULTIPLY SUM (AND ACCUMULATE)
A galois field multiplication in field 2 is like binary multiplication,
however instead of doing ordinary binary additions, xor's are performed.
So no carries are considered.

Implement all variants via helpers. s390_vec_sar() and s390_vec_shr()
will be reused later on.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00
David Hildenbrand 449a8ac250 s390x/tcg: Implement VECTOR COUNT TRAILING ZEROS
Implement it similar to VECTOR COUNT LEADING ZEROS.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2019-05-17 10:54:13 +02:00