We can reuse some of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Take care of reading/indicating the 32-bit elements.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse most of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse most of the infrastructure added for VECTOR FP ADD.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
1. We'll reuse op_vcdg() for similar instructions later, prepare for
that.
2. We'll reuse vop64_2() later for other instructions.
We have to mangle the erm (effective rounding mode) and the m4 into
the simd_data(), and properly unmangle them again.
Make sure to restore the erm before triggering an exception.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Provide for all three instructions all four combinations of cc bit and
s bit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
As far as I can see, there is only a tiny difference.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
1. We'll reuse op_vfa() for similar instructions later, prepare for
that.
2. We'll reuse vop64_3() for other instructions later.
3. Take care of modifying the vector register only if no trap happened.
- on traps, flags are not updated and no elements are modified
- traps don't modify the fpc flags
- without traps, all exceptions of all elements are merged
4. We'll reuse check_ieee_exc() later when we need the XxC flag.
We have to check for exceptions after processing each element.
Provide separate handlers for single/all element processing. We'll do
the same for all applicable FP instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Unfortunately, there is no easy way to avoid looping over all elements
in v2. Provide specialized variants for !cc,!rt/!cc,rt/cc,!rt/cc,rt and
all element types. Especially for different values of rt, the compiler
might be able to optimize the code a lot.
Add s390_vec_write_element().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR FIND ELEMENT EQUAL. Core logic courtesy of Richard H.
Add s390_vec_read_element() that can deal with element sizes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Complicated stuff. Provide two different helpers for CC an !CC handling.
We might want to add more helpers later.
zero_search() and match_index() are courtesy of Richard H.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's return the cc value directly via cpu_env. Unfortunately there
isn't a simple way to calculate the value lazily - one would have to
calculate and store e.g. the population count of the mask and the
result so it can be evaluated in a cc helper.
But as VTM only sets the cc, we can assume the value will be needed soon
either way.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's keep it simple for now and handle 8/16 bit elements via helpers.
Especially for 8/16, we could come up with some bit tricks.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SHIFT RIGHT ARITHMETICAL.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SHIFT LEFT ARITHMETIC. Add s390_vec_sar() similar to
s390_vec_shr().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse the existing 128-bit shift utility function.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Use the new vector expansion for GVecGen3i.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Take care of properly taking the modulo of the count. We might later
want to come back and create a variant of VERLL where the base register
is 0, resulting in an immediate.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR COUNT TRAILING ZEROES.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Yet another set of variants. Implement it similar to VECTOR MULTIPLY AND
ADD *. At least for one variant we have a gvec helper we can reuse.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Quite some variants to handle. At least handle some 32-bit element
variants via gvec expansion (we could also handle 16/32-bit variants
for ODD and EVEN easily via gvec expansion, but let's keep it simple
for now).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
A galois field multiplication in field 2 is like binary multiplication,
however instead of doing ordinary binary additions, xor's are performed.
So no carries are considered.
Implement all variants via helpers. s390_vec_sar() and s390_vec_shr()
will be reused later on.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Implement it similar to VECTOR COUNT LEADING ZEROS.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
For 8/16, use the 32 bit variant and properly subtract the added
leading zero bits.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR AVERAGE but without sign extension.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Handle 32/64-bit elements via gvec expansion and the 8/16 bits via
ool helpers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Very similar to VECTOR LOAD WITH LENGTH, just the opposite direction.
Properly probe write access before modifying memory.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-32-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instead of checking e.g. the first access on every touched page, we should
check the actual access, otherwise we might get false positives when Low
Address Protection (LAP) is active. As probe_write() can only deal with
accesses to one page, we have to loop.
Use i64 for the length, although not needed - easier to reuse
TCG temps we already have in the translation functions where this will
be used. Also allow it to be used from other helpers.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-28-david@redhat.com>
[CH: add missing page_check_range()]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Take care of overlying inputs and outputs by using a temporary vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-21-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is a big one. Luckily we only have a limited set of such nasty
instructions.
We'll implement all variants with helpers, except when sources and
the destination don't overlap for VECTOR PACK. Provide different helpers
when the cc is to be modified. We'll return the cc then via env->cc_op.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-20-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to LOAD COUNT TO BLOCK BOUNDARY, but instead of only
calculating, the actual vector is loaded. Use a temporary vector to
not modify the real vector on exceptions. Initialize that one to zero,
to not leak any data. Provide a fast path if we're loading a full
vector.
As we don't have gvec ool handlers for single vectors, just calculate
the vector address manually.
We can reuse the helper later on for VECTOR LOAD WITH LENGTH. In fact,
we are going to name it "vll" right from the beginning, because that's
a better match.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
With the floating-point extension facility, LOAD ROUNDED has
a rounding mode specification and the inexact-exception control (XxC).
Handle them just like e.g. LOAD FP INTEGER.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's split handling of BFP/DFP rounding mode configuration. Also,
let's not reuse the sfpc handler, use a separate handler so we can
properly check for specification exceptions for SRNMB.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We already forward the 3 bits correctly in the translation functions. We
also have to handle them properly and check for specification
exceptions.
Setting an invalid rounding mode (BFP only, all DFP rounding modes)
results in a specification exception. Setting unassigned bits in the
fpc, results in a specification exception.
This fixes LOAD FPC (AND SIGNAL), SET FPC (AND SIGNAL). Also for,
SET BFP ROUNDING MODE, 3-bit rounding mode is now explicitly checked.
Note: TCG_CALL_NO_WG is required for sfpc handler, as we now inject
exceptions.
We won't be modeling abscence of the "floating-point extension facility"
for now, not necessary as most take the facility for granted without
checking.
z14 PoP, 9-23, "LOAD FPC"
When the floating-point extension facility is
installed, bits 29-31 of the second operand must
specify a valid BFP rounding mode and bits 6-7,
14-15, 24, and 28 must be zero; otherwise, a
specification exception is recognized.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-9-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is a non-privileged instruction that was only implemented
for system mode. However, the stck instruction is used by glibc,
so this was causing SIGILL for programs run under debian stretch.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190212053044.29015-3-richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The DXC is to be stored in the low core, and only in the FPC in case AFP
is enabled in CR0. Stub is not required in current code, but this way
we never run into problems.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180927130303.12236-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This allows a guest to change its TOD. We already take care of updating
all CKC timers from within S390TODClass.
Use MO_ALIGN to load the operand manually - this will properly trigger a
SPECIFICATION exception.
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180627134410.4901-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
On s390x, pci support is implemented via a set of instructions
(no mmio). Unfortunately, none of them are documented in the
PoP; the code is based upon the existing implementation for KVM
and the Linux zpci driver.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Use s390_cpu_virt_mem_write() so we can actually revert what we did
(re-inject the dequeued IO interrupt).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
CC == 2 can only happen due to a protection exception, not if memory is
not available (PGM_ADDRESSING). So all PGM_ADDRESSING exceptions have to
be forwarded to the guest.
Since the initial definition of TEST PROTECTION, we now read globals
(e.g. PSW mask), so we have to correctly mark the instruction
(otherwise, e.g. booting fedora 27 fails).
Also, the architecture explicitly specifies which exceptions are
forwarded to the guest, this makes the code a little nicer.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180112125452.8569-1-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Linux uses TEST PROTECTION to sense for available memory locations.
Let's implement what we can for now (just as for the other instructions,
excluding AR mode and special protection mechanisms).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171218224616.21030-2-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It only provides the EXTRACT CPU TIME instruction. We can reuse the stpt
helper, which calculates the CPU timer value.
As the instruction is not privileged, but we don't have a CPU timer
value in case of linux user, we simply reuse cpu_get_host_ticks() to
produce some descending value.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-13-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's just wire it up like KVM.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's handle it just like KVM:
Depending on the model, this instruction may not be
provided. When this instruction is not provided, it is
checked for operand exception and privileged-opera-
tion exception, and then is suppressed.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>