The second word has been loaded from the unincremented
address since the first commit.
Fixes: 44ac14b06f
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190322234302.12770-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Don't announce that exit simcall has been invoked: this is just noise.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to
source files. That's because when trace-events got split up, the
comments were moved verbatim.
Delete the sub/dir/ part from these comments. Gets rid of several
misspellings.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190314180929.27722-3-armbru@redhat.com
Message-Id: <20190314180929.27722-3-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
While running the GCC test suite against 4.0.0-rc0, Kito found a
regression introduced by the decodetree conversion that caused divuw and
remuw to sign-extend their inputs. The ISA manual says they are
supposed to be zero extended:
DIVW and DIVUW instructions are only valid for RV64, and divide the
lower 32 bits of rs1 by the lower 32 bits of rs2, treating them as
signed and unsigned integers respectively, placing the 32-bit
quotient in rd, sign-extended to 64 bits. REMW and REMUW
instructions are only valid for RV64, and provide the corresponding
signed and unsigned remainder operations respectively. Both REMW
and REMUW always sign-extend the 32-bit result to 64 bits, including
on a divide by zero.
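For reference, here is the intended DIVUW behaviour written out as
plain C (a standalone sketch, not QEMU code; per the ISA, unsigned
divide by zero yields all ones):

    #include <stdint.h>

    static uint64_t divuw(uint64_t rs1, uint64_t rs2)
    {
        uint32_t a = (uint32_t)rs1;           /* zero-extend low 32 bits */
        uint32_t b = (uint32_t)rs2;
        uint32_t q = b ? a / b : UINT32_MAX;  /* div by zero -> all ones */
        return (uint64_t)(int64_t)(int32_t)q; /* sign-extend the quotient */
    }

The regression sign-extended a and b instead, which changes the
quotient whenever bit 31 of either input is set.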
Here's Kito's reduced test case from the GCC test suite:
#include <stdlib.h>

unsigned calc_mp(unsigned mod)
{
    unsigned a, b, c;
    c = -1;
    a = c / mod;
    b = 0 - a * mod;
    if (b > mod) { a += 1; b -= mod; }
    return b;
}

int main(int argc, char *argv[])
{
    unsigned x = 1234;
    unsigned y = calc_mp(x);
    if ((sizeof (y) == 4 && y != 680)
        || (sizeof (y) == 2 && y != 134))
        abort ();
    exit (0);
}
I haven't done any other testing on this, but it does fix the test case.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
break_dependency incorrectly handles the case of a dependency on an
opcode that references the same register multiple times. E.g. the
following instruction is translated incorrectly:
{ or a2, a3, a3 ; or a3, a2, a2 }
This happens because the resource indices of both dependency graph
nodes are incremented, and no copy is made for the second instance of
the same register in the ending node.
Only increment the resource index of the ending node of the dependency.
Add a test.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Currently, the Cascadelake-Server, Icelake-Client, and
Icelake-Server CPU models always generate the following warning:
qemu-system-x86_64: warning: \
host doesn't support requested feature: CPUID.07H:ECX [bit 4]
This happens because OSPKE was never returned by
GET_SUPPORTED_CPUID or x86_cpu_get_supported_feature_word().
OSPKE is a runtime flag automatically set by the KVM module or by
TCG code; it was always cleared by x86_cpu_filter_features() and
was never supposed to appear in the CPU model table.
Remove the OSPKE flag from the CPU model table entries, to avoid
the bogus warning and avoid returning invalid feature data on
query-cpu-* QMP commands. As OSPKE was always cleared by
x86_cpu_filter_features(), this won't have any guest-visible
impact.
Include a test case that should detect the problem if we introduce
a similar bug again.
Fixes: c7a88b52f6 ("i386: Add new model of Cascadelake-Server")
Fixes: 8a11c62da9 ("i386: Add new CPU model Icelake-{Server,Client}")
Cc: Tao Xu <tao3.xu@intel.com>
Cc: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190319200515.14999-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Now that kvm_arch_get_supported_cpuid() will only return
arch_capabilities if QEMU is able to initialize the MSR properly,
we know that the feature is safely migratable.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190125220606.4864-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
KVM has two bugs in the handling of MSR_IA32_ARCH_CAPABILITIES:
1) Linux commit 1eaafe91a0df ("kvm: x86: IA32_ARCH_CAPABILITIES
is always supported") makes GET_SUPPORTED_CPUID return
arch_capabilities even if running on SVM. This makes "-cpu
host,migratable=off" incorrectly expose arch_capabilities on CPUID on
AMD hosts (where the MSR is not emulated by KVM).
2) KVM_GET_MSR_INDEX_LIST does not return MSR_IA32_ARCH_CAPABILITIES if
the MSR is not supported by the host CPU. This makes QEMU not
initialize the MSR properly at kvm_put_msrs() on those hosts.
Work around both bugs on the QEMU side, by checking if the MSR
was returned by KVM_GET_MSR_INDEX_LIST before returning the
feature flag on kvm_arch_get_supported_cpuid().
This has the unfortunate side effect of making arch_capabilities
unavailable on hosts without hardware support for the MSR until bug #2
is fixed on KVM, but I can't see another way to work around bug #1
without that side effect.
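A minimal sketch of the resulting check in
kvm_arch_get_supported_cpuid(), assuming a has_msr_arch_capabs flag
that is populated while scanning the KVM_GET_MSR_INDEX_LIST result:

    /* sketch: only expose the CPUID bit if KVM listed the MSR */
    if (function == 7 && index == 0 && reg == R_EDX) {
        if (!has_msr_arch_capabs) {
            ret &= ~CPUID_7_0_EDX_ARCH_CAPABILITIES;
        }
    }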
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190125220606.4864-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Alistair Francis <Alistair.Francis@wdc.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Alistair Francis <Alistair.Francis@wdc.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
If vectored interrupts are enabled (bits[1:0]
of mtvec/stvec == 1) then use the following
logic for trap entry address calculation:
pc = mtvec + cause * 4
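A sketch of the resulting entry-address calculation for machine mode
(the async flag comes from the sync/async split described below; shape
assumed, not the verbatim patch):

    /* sketch: vectored mode only redirects asynchronous traps */
    if (async && (env->mtvec & 0x3) == 1) {
        env->pc = (env->mtvec & ~0x3) + cause * 4;
    } else {
        env->pc = env->mtvec & ~0x3;
    }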
In addition to adding support for vectored interrupts
this patch simplifies the interrupt delivery logic
by making sync/async cause decoding and encoding
steps distinct.
The cause code and the sign bit indicating sync/async
are split at the beginning of the function, and the fixed
cause variable is renamed to cause. The MSB setting for async
traps is delayed until setting mcause/scause to allow
redundant variables to be eliminated. Some variables
are renamed for conciseness and moved so that decls
are at the start of the block.
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Alistair Francis <Alistair.Francis@wdc.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This effectively changes riscv_cpu_update_mip
from edge to level, i.e. cpu_interrupt or
cpu_reset_interrupt is called regardless of
the current interrupt level.
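A sketch of the level-triggered update (shape assumed, not the
verbatim patch):

    /* sketch: raise or lower the line from the current mip value,
       without comparing against the previous level */
    if (env->mip) {
        cpu_interrupt(cs, CPU_INTERRUPT_HARD);
    } else {
        cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD);
    }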
Fixes "WFI doesn't return when an IPI is issued":
- https://github.com/riscv/riscv-qemu/issues/132
To test:
1) Apply RISC-V Linux CPU hotplug patch:
- http://lists.infradead.org/pipermail/linux-riscv/2018-May/000603.html
2) Enable CONFIG_CPU_HOTPLUG in linux .config
3) Try to offline and online cpus:
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu2/online
Reported-by: Atish Patra <atishp04@gmail.com>
Cc: Atish Patra <atishp04@gmail.com>
Cc: Alistair Francis <Alistair.Francis@wdc.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This change checks elf_flags for EF_RISCV_RVE and, if
present, uses the RVE Linux syscall ABI, which uses t0
for the syscall number instead of a7.
Warn and exit if a non-RVE ABI binary is run on a
CPU with the RVE extension, as it is incompatible.
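A sketch of the register selection (standalone illustration; the
helper name is hypothetical, EF_RISCV_RVE comes from the ELF header
flags):

    /* sketch: RVE binaries pass the syscall number in t0 (x5),
       the standard ABI uses a7 (x17) */
    static int riscv_syscall_regno(uint32_t elf_flags)
    {
        return (elf_flags & EF_RISCV_RVE) ? 5 : 17;
    }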
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Alistair Francis <Alistair.Francis@wdc.com>
Co-authored-by: Kito Cheng <kito.cheng@gmail.com>
Co-authored-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
We can't allow the supervisor to control SEIP as this would allow the
supervisor to clear a pending external interrupt, which would result in
a lost interrupt in the case where a PLIC is attached. The SEIP bit must
be hardware controlled when a PLIC is attached.
This logic was previously hard-coded so SEIP was always masked even
if no PLIC was attached. This patch adds riscv_cpu_claim_interrupts
so that the PLIC can register control of SEIP. In the case of models
without a PLIC (spike), the SEIP bit remains software controlled.
This interface allows for hardware control of supervisor timer and
software interrupts by other interrupt controller models.
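A sketch of how a PLIC model would claim the bit at init time (the
error handling here is illustrative):

    /* sketch: take hardware control of SEIP; fails if already claimed */
    if (riscv_cpu_claim_interrupts(cpu, MIP_SEIP) < 0) {
        error_report("SEIP already claimed");
        exit(1);
    }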
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Alistair Francis <Alistair.Francis@wdc.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The gdb CSR xml file has registers in documentation order, not numerical
order, so we need a table to map the register numbers. This also adds
fairly standard gdb hooks to access xml specified registers.
Notice:
The fpu xml from gdb 8.3 has an unused register number (65), which
makes the first CSR register number 69. We register an extra register
with gdb to correct the CSR offset calculation.
Signed-off-by: Jim Wilson <jimw@sifive.com>
Signed-off-by: Chih-Min Chao <chihmin.chao@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Add a debugger field to CPURISCVState. Add a riscv_csrrw_debug function
to set it. Disable mode checks when the debugger field is true.
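A sketch of the shape, assuming the existing riscv_csrrw() signature:

    int riscv_csrrw_debug(CPURISCVState *env, int csrno, target_ulong *ret,
                          target_ulong new_value, target_ulong write_mask)
    {
        int rc;

        env->debugger = true;   /* bypass mode/privilege checks */
        rc = riscv_csrrw(env, csrno, ret, new_value, write_mask);
        env->debugger = false;
        return rc;
    }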
Signed-off-by: Jim Wilson <jimw@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190212230903.9215-1-jimw@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This adds some missing CSR_* register macros, and documents some as being
priv v1.9.1 specific.
Signed-off-by: Jim Wilson <jimw@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190212230830.9160-1-jimw@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The RAM device presents a memory region that should be handled
as an IO region and should not be pinned.
In the case of vfio-pci, the RAM device represents an MMIO BAR
and the memory region is not backed by pages, hence
KVM_MEMORY_ENCRYPT_REG_REGION fails to lock the memory range.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1667249
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Message-Id: <20190204222322.26766-3-brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
During the refactor to decodetree we removed the manual decoding that is
necessary for c.jal/c.addiw and removed the translation of c.flw/c.ld
and c.fsw/c.sd. This reintroduces the manual parsing and the
omitted implementation.
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Within a delay slot, we were squishing both DISAS_IAQ_N_STALE and
DISAS_IAQ_N_STALE_EXIT to DISAS_IAQ_N_UPDATED. This lost the
required exit to the main loop, and could result in interrupts
never being delivered.
Tested-by: Sven Schnelle <svens@stackframe.org>
Reported-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These instructions do not trap when SVE is disabled in EL0,
causing them to be executed with the wrong size information.
Signed-off-by: Amir Charif <amir.charif@cea.fr>
Message-id: 1552579248-31025-1-git-send-email-amir.charif@cea.fr
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added 'target/arm' prefix to subject]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some generic arch timer registers are Config-RW in EL0,
which means the EL0 exception level can have write permission
if it is appropriately configured.
When the VM accesses these registers, QEMU first checks whether they
have RW permission, then checks whether they are appropriately
configured.
If they are defined as read-only in EL0, then even though they have
been appropriately configured, they still do not have write permission.
So we need to add the write permission according to the ARMv8 spec
when defining them.
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Message-id: 1552395177-12608-1-git-send-email-gengdongjiu@huawei.com
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-4.0-sf4' into staging
target/riscv: Convert to decodetree
Bastian: this patchset converts the RISC-V decoder to decodetree in three major steps:
1) Convert 32-bit instructions to decodetree [Patch 1-15]:
Many of the gen_* functions are called by the decode functions for 16-bit
and 32-bit functions. If we move translation code from the gen_*
functions to the generated trans_* functions of decode-tree, we get a lot of
duplication. Therefore, we mostly generate calls to the old gen_* function
which are properly replaced after step 2).
Each of the trans_ functions are grouped into files corresponding to their
ISA extension, e.g. addi which is in RV32I is translated in the file
'trans_rvi.inc.c'.
2) Convert 16-bit instructions to decodetree [Patch 16-18]:
All 16 bit instructions have a direct mapping to a 32 bit instruction. Thus,
we convert the arguments in the 16 bit trans_ function to the arguments of
the corresponding 32 bit instruction and call the 32 bit trans_ function.
3) Remove old manual decoding in gen_* functions [Patch 19-29]:
this moves all manual translation code into the trans_* functions of
decodetree, such that we can remove the old decode_* functions.
Palmer: This, with some additional cleanup patches, passed Alistair's
testing on rv32 and rv64 as well as my testing on rv64, so I think it's
good to go. I've run my standard test against this exact tag.
I still don't have a Mac to try this on, sorry!
# gpg: Signature made Wed 13 Mar 2019 13:44:49 GMT
# gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
# gpg: issuer "palmer@dabbelt.com"
# gpg: Good signature from "Palmer Dabbelt <palmer@dabbelt.com>" [unknown]
# gpg: aka "Palmer Dabbelt <palmer@sifive.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88 6DF8 EF4C A150 2CCB AB41
* remotes/palmer/tags/riscv-for-master-4.0-sf4: (29 commits)
target/riscv: Remove decode_RV32_64G()
target/riscv: Remove gen_system()
target/riscv: Rename trans_arith to gen_arith
target/riscv: Remove manual decoding of RV32/64M insn
target/riscv: Remove shift and slt insn manual decoding
target/riscv: make ADD/SUB/OR/XOR/AND insn use arg lists
target/riscv: Move gen_arith_imm() decoding into trans_* functions
target/riscv: Remove manual decoding from gen_store()
target/riscv: Remove manual decoding from gen_load()
target/riscv: Remove manual decoding from gen_branch()
target/riscv: Remove gen_jalr()
target/riscv: Convert quadrant 2 of RVXC insns to decodetree
target/riscv: Convert quadrant 1 of RVXC insns to decodetree
target/riscv: Convert quadrant 0 of RVXC insns to decodetree
target/riscv: Convert RV priv insns to decodetree
target/riscv: Convert RV64D insns to decodetree
target/riscv: Convert RV32D insns to decodetree
target/riscv: Convert RV64F insns to decodetree
target/riscv: Convert RV32F insns to decodetree
target/riscv: Convert RV64A insns to decodetree
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Decodetree handles all instructions now, so the fallback is not
necessary anymore.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
With all 16-bit insns moved to decodetree, no path falls back to
gen_system(), so we can remove it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
Manual decoding in gen_arith() is not necessary with decodetree. For now
the function is called trans_arith as the original gen_arith still
exists. The former will be renamed to gen_arith as soon as the old
gen_arith can be removed.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
gen_arith_imm() does a lot of decoding manually, which was hard to read
in case of the shift instructions and is not necessary anymore with
decodetree.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
With decodetree we don't need to convert RISC-V opcodes to MemOps
as the old gen_store() did.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
With decodetree we don't need to convert RISC-V opcodes to MemOps
as the old gen_load() did.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
We now utilize argument-sets of decodetree such that no manual
decoding is necessary.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
trans_jalr() is the only caller, so move the code into trans_jalr().
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
We cannot remove the call to gen_arith() in decode_RV32_64G() since it
is used to translate multiply instructions.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
This splits the 64-bit-only instructions into their own decode file such
that we generate the decoder for these instructions only for the RISC-V
64-bit target.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
For now, only LUI & AUIPC are decoded and translated. If decodetree fails, we
fall back to the old decoder.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
The current code assumes that we don't need to exit the TB
if a Data Cache Flush or Insert has happened. However, as we
have a shared Data/Instruction TLB, a Data cache flush also
flushes Instruction TLB entries, and a Data cache TLB insert
might also evict an Instruction TLB entry.
So exit the TB in all cases if Instruction translation is enabled.
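A sketch of the check in the translator (flag and enum names as used
elsewhere in target/hppa; exact shape assumed):

    /* sketch: a D-cache flush/insert may have invalidated I-TLB
       state, so force a TB exit while code translation is enabled */
    if (ctx->tb_flags & PSW_C) {
        ctx->base.is_jmp = DISAS_IAQ_N_STALE;
    }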
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-11-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-10-svens@stackframe.org>
[rth: Add required tlb flushing when prot id registers change.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The ODE software calls itlbp on existing TLB entries without
calling itlba first, so this seems to be valid.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-9-svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
b,gate does GR[t] ← cat(GR[t]{0..29},IAOQ_Front{30..31});
instead of saving the link address to register t.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-8-svens@stackframe.org>
[rth: Move link check outside of ifndef CONFIG_USER_ONLY;
use ctx->privilege; nullify the insn earlier.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
DIAG is usually only used by diagnostics software as it's CPU
specific. In most of the cases it's better to ignore it and log
a message that it's not implemented.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-7-svens@stackframe.org>
[rth: Free the nullify condition.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
HP ODE uses rfi to set the Q bit, and I don't see anything in the
documentation saying that this is forbidden. So remove it.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-6-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To ease TLB debugging add a few trace events, which are disabled
by default so that there's no performance impact.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-5-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-4-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Assume the following sequence:
pitlbe r0(sr0,r0)
iitlba r4,(sr0,r0)
ldil L%3000000,r5
iitlbp r5,(sr0,r0)
This will purge the whole TLB and add an entry for page 0. However,
the current TLB implementation in helper_iitlba() will store to
the last empty TLB entry, while helper_iitlbp() will write to the
first empty entry. That is because an empty entry will match address
0 in helper_iitlba().
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-3-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When one of the source registers is the same as the destination register,
the source register gets overwritten with the destination value before
do_add_sv() is called, which leads to unexpected condition matches.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-2-svens@stackframe.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We got away with eliding this check when target/hppa was user-only,
but missed adding this check when adding system support.
Fixes an early crash in the HP-UX 11 installer.
Reported-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers looking like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190309214255.9952-3-f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The t0 tcg_temp register is now unused, remove it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190309214255.9952-2-f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We now have enough support to boot a PowerNV machine with a POWER9
processor. Allow HV mode on POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-16-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that all VSX registers are stored in host endian order, there is no need
to go via different accessors depending upon the register number. Instead we
introduce vsr64_offset() and use it directly from within get_cpu_vsr{l,h}() and
set_cpu_vsr{l,h}().
This also allows us to rewrite avr64_offset() and fpr_offset() in terms of the
new vsr64_offset() function to more clearly express the relationship between the
VSX, FPR and VMX registers, and also remove vsrl_offset() which is no longer
required.
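A sketch of the new accessors (field and macro names assumed; VsrD()
is the host-endian doubleword selector mentioned later in this series):

    static inline int vsr64_offset(int i, bool high)
    {
        return offsetof(CPUPPCState, vsr[i].VsrD(high ? 0 : 1));
    }

    static void get_cpu_vsrh(TCGv_i64 dst, int n)
    {
        tcg_gen_ld_i64(dst, cpu_env, vsr64_offset(n, true));
    }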
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-8-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When VSX support was initially added, the fpr registers were added at
offset 0 of the VSR register and the vsrl registers were added at offset
1. This is in contrast to the VMX registers (the last 32 VSX registers) which
are stored in host-endian order.
Switch the fpr/vsrl registers so that the lower 32 VSX registers are now also
stored in host endian order to match the VMX registers. This ensures that TCG
vector operations involving mixed VMX and VSX registers will function
correctly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
By using the VsrD macro in avr64_offset() the same offset calculation can be
used regardless of the host endian. This allows get_avr64() and set_avr64() to
be simplified accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-6-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All TCG vector operations require pointers to the base address of the vector
rather than separate access to the top and bottom 64-bits. Convert the VMX TCG
instructions to use a new avr_full_offset() function instead of avr64_offset()
which can then itself be written as a simple wrapper onto vsr_full_offset().
This same function can also reused in cpu_avr_ptr() to avoid having more than
one copy of the offset calculation logic.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-5-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It isn't possible to include internal.h from cpu.h so move the Vsr* macros
into cpu.h alongside the other VMX/VSX register access functions.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-4-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of having multiple copies of the offset calculation logic, move it to a
single vsrl_offset() function.
This commit also renames the existing get_vsr()/set_vsr() functions to
get_vsrl()/set_vsrl() which better describes their purpose.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of having multiple copies of the offset calculation logic, move it to a
single fpr_offset() function.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The H_CALL H_PAGE_INIT can be used to zero or copy a page of guest
memory. Enable the in-kernel H_PAGE_INIT handler.
The in-kernel handler takes half the time to complete compared to
handling the H_CALL in userspace.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190306060608.19935-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are four scenarios being handled in this function:
- single stepping
- hardware breakpoints
- software breakpoints
- fallback (no debug supported)
A future patch will add code to handle specific single step and
software breakpoints cases so let's split each scenario into its own
function now to avoid hurting readability.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190228225759.21328-5-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is in preparation for a refactoring of the kvm_handle_debug
function in the next patch.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20190228225759.21328-4-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate
the requirement for a hw-assisted version of the count cache flush
workaround.
The count cache flush workaround is a software workaround which can be
used to flush the count cache on context switch. Some revisions of
hardware may have a hardware accelerated flush, in which case the
software flush can be shortened. This cap is used to set the
availability of such hardware acceleration for the count cache flush
routine.
The availability of such hardware acceleration is indicated by the
H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics
returned from the KVM_PPC_GET_CPU_CHAR ioctl.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability
for mitigations for indirect branch speculation. Currently the available
values are broken (default), fixed-ibs (fixed by serialising indirect
branches) and fixed-ccd (fixed by disabling the count cache).
Introduce a new value for this capability denoted workaround, meaning that
software can work around the issue by flushing the count cache on
context switch. This option is available if the hypervisor sets the
H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from
the KVM_PPC_GET_CPU_CHAR ioctl.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Message-Id: <20190301031912.28809-1-sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Implement support to allow KVM guests to take advantage of the large
decrementer introduced on POWER9 cpus.
To determine if the host can support the requested large decrementer
size, we check it matches that specified in the ibm,dec-bits device-tree
property. We also need to enable it in KVM by setting the LPCR_LD bit in
the LPCR. Note that to do this we need to try to set the bit, then read
it back to check whether the host allowed us to set it; if so we can
use it, but if we were unable to set it the host cannot support it and
we must not use the large decrementer.
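A sketch of the probe (hypothetical helper name; uses the KVM one-reg
API, error handling elided for brevity):

    /* sketch: set LPCR_LD, read it back, and only report success if
       the host actually kept the bit */
    static int kvmppc_probe_large_decr(CPUState *cs, uint64_t *lpcr)
    {
        *lpcr |= LPCR_LD;
        kvm_set_one_reg(cs, KVM_REG_PPC_LPCR_64, lpcr);
        kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, lpcr);
        return (*lpcr & LPCR_LD) ? 0 : -1;
    }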
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-3-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Prior to POWER9 the decrementer was a 32-bit register which decremented
with each tick of the timebase. From POWER9 onwards the decrementer can
be set to operate in a mode called large decrementer where it acts as an
n-bit decrementing register which is visible as a 64-bit register, that
is, the value of the decrementer is sign-extended to 64 bits (where n is
implementation dependent).
The mode in which the decrementer operates is controlled by the LPCR_LD
bit in the logical partition control register (LPCR).
From POWER9 onwards the HDEC (hypervisor decrementer) was enlarged to
h bits, also sign-extended to 64 bits (where h is implementation
dependent). Note this isn't configurable and is always enabled.
On POWER9 the large decrementer and hdec are both 56 bits, as
represented by the lrg_decr_bits cpu class property. Since they are the
same size we only add one property for now, which could be extended in
the case they ever differ in the future.
We also add the lrg_decr_bits property for POWER5+/7/8 since it is used
to determine the size of the hdec, which is only generated on the
POWER5+ processor and later. On these processors it is 32 bits.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190301024317.22137-2-sjitindarsingh@gmail.com>
[dwg: Small style fixes]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Intel Processor Trace requires CPUID[0x14], but cpuid_level was not
updated when creating a KVM guest with
e.g. "-cpu qemu64,+intel-pt".
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1548805979-12321-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUID code will call kvm_arch_get_supported_cpuid() and, even though
the call is guarded by kvm_enabled() and thus never runs for user-mode
emulators, sometimes clang will not optimize it out at -O0.
That could be considered a compiler bug, however at -O0 we give it
a pass and just add the stubs.
Reported-by: Kamil Rytarowski <n54@gmx.com>
Tested-by: Kamil Rytarowski <n54@gmx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Combine all variants in a single handler. As source and destination
have different element sizes, we can't use gvec expansion. Expand
manually. Also watch out for overlapping source and destination
registers. Use a safe evaluation order depending on the operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-33-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to VECTOR LOAD WITH LENGTH, just the opposite direction.
Properly probe write access before modifying memory.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-32-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Similar to VECTOR LOAD MULTIPLE, just the opposite direction. Probe
write access first.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-31-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
As we only store one element, there is nothing to consider regarding
exceptions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-30-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instead of checking e.g. the first access on every touched page, we should
check the actual access, otherwise we might get false positives when Low
Address Protection (LAP) is active. As probe_write() can only deal with
accesses to one page, we have to loop.
Use i64 for the length, although it is not needed; this makes it easier
to reuse TCG temps we already have in the translation functions where
this will be used. Also allow it to be used from other helpers.
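A sketch of the loop (shape assumed; probe_write() and wrap_address()
as used by other s390x helpers):

    void probe_write_access(CPUS390XState *env, uint64_t addr,
                            uint64_t len, uintptr_t ra)
    {
        while (len) {
            /* bytes remaining on the current page */
            uint64_t pagelen = -(addr | TARGET_PAGE_MASK);
            uint64_t curlen = MIN(pagelen, len);

            probe_write(env, addr, curlen, cpu_mmu_index(env, false), ra);
            addr = wrap_address(env, addr + curlen);
            len -= curlen;
        }
    }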
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-28-david@redhat.com>
[CH: add missing page_check_range()]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Load both elements signed and store them into the two 64 bit elements.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-27-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Provide an implementation based on i64 and on real host vectors.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-26-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Similar to VECTOR GATHER ELEMENT, but the other direction.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-25-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Like VECTOR REPLICATE, but the element to be replicated comes from an
immediate.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-24-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Replicate via the special gvec helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-23-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Read the whole input before modifying the destination vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-22-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Take care of overlapping inputs and outputs by using a temporary vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-21-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is a big one. Luckily we only have a limited set of such nasty
instructions.
We'll implement all variants with helpers, except when sources and
the destination don't overlap for VECTOR PACK. Provide different helpers
when the cc is to be modified. We'll return the cc then via env->cc_op.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-20-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We cannot use gvec expansion as source and destination elements
have different element numbers. So we'll expand using a fancy loop.
Also, we have to take care of overlapping source and destination
registers, therefore use a safe evaluation order depending on the
operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-19-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can reuse the helper introduced along with VECTOR LOAD TO BLOCK
BOUNDARY. We just have to take care of converting the highest index into
a length.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-18-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fairly easy, just load from two gprs into a single vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-17-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to VECTOR LOAD GR FROM VR ELEMENT, just the opposite
direction. Also provide a fast path in case we don't care about the
register content.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-16-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to LOAD COUNT TO BLOCK BOUNDARY, but instead of only
calculating, the actual vector is loaded. Use a temporary vector to
not modify the real vector on exceptions. Initialize that one to zero,
to not leak any data. Provide a fast path if we're loading a full
vector.
As we don't have gvec ool handlers for single vectors, just calculate
the vector address manually.
We can reuse the helper later on for VECTOR LOAD WITH LENGTH. In fact,
we are going to name it "vll" right from the beginning, because that's
a better match.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Try to load the last element first. Access to the first element will
be checked afterwards. This way, we can guarantee that the vector is
not modified before we checked for all possible exceptions. (16 vectors
cannot cross more than two pages)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fairly easy, zero out the vector before we load the desired element.
Load the element before touching the vector.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-13-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
To avoid a helper, we have to do the actual calculation of the element
address (offset in cpu_env + cpu_env) manually. Factor that out into
get_vec_element_ptr_i64(). The same logic will be reused for "VECTOR
LOAD VR ELEMENT FROM GR".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-12-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Take care of properly sign-extending the immediate.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-11-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fairly easy, load with desired size and store it into the right element.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can use tcg_gen_gvec_dup_i64() to carry out the duplication.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When loading from memory, load both elements into temps first before
modifying the target vector.
Loading with strange alignment from the end of the address space will
not properly wrap; we can ignore that for now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Add gen_gvec_dupi() for handling duplication of immediates, so it can
be reused later.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's optimize it for the common cases (setting a vector to zero or all
ones) - courtesy of Richard.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's start with a more involved one, but it is the first in the list
of vector support instructions (introduced with the vector facility).
Good thing is, we need a lot of basic infrastructure for this. Reading
and writing vector elements as well as checking element validity.
All vector instruction related translation functions will reside in
translate_vx.inc.c, to be included in translate.c - similar to how
other architectures handle it.
While at it, directly add some documentation (which contains parts about
things added in follow-up patches, but splitting this up does not make
too much sense). Also add ES_* defines heavily used later.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We'll have to read/write vector elements quite frequently from helpers.
The tricky bit is properly taking care of endianness. Handle it
similarly to aarch64.
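The idea for byte elements, as a standalone sketch (not the actual
helpers): the 128-bit vector is stored as two host-endian uint64_t, so
extracting via shifts keeps the architectural big-endian element order
on any host:

    typedef struct { uint64_t dw[2]; } Vec128;

    static uint8_t vec_read_element8(const Vec128 *v, int enr)
    {
        /* element 0 is the most significant byte of doubleword 0 */
        return (uint8_t)(v->dw[enr / 8] >> (56 - (enr % 8) * 8));
    }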
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-4-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Check them at a central point. We'll use a new instruction flag to
flag all vector instructions (IF_VEC) and handle it very similar to
AFP, whereby we use another unused position in the PSW mask to store
the state of vector register enablement per translation block.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
These are the new instruction formats related to vector instructions, as
defined up to the z14 (a.k.a. latest PoP).
As v2 appears (like x2 in VRX) with d2/b2 in VRV, we have to assign it a
higher field number to avoid collisions.
Properly take care of the MSB (to be able to address 32 registers) for
each vector register field stored in the RXB field (bits 36-39 for all
vector instructions). As we have 32 vector registers and the
"v" fields are only 4 bits in size, the 5th bit is stored in the RXB.
We use a new type to indicate that the MSB has to be fetched from the
RXB.
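A sketch of the extraction, assuming the 48-bit instruction sits
right-aligned in a uint64_t and using the PoP's big-endian bit
numbering (bit 0 is the instruction's MSB):

    /* sketch: RXB bit n supplies the MSB of vector-register field n + 1 */
    static int vec_reg_num(uint64_t insn, int field4, int rxb_bit)
    {
        int rxb = (insn >> 8) & 0xf;  /* instruction bits 36-39 */

        return field4 | (((rxb >> (3 - rxb_bit)) & 1) << 4);
    }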
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
There are some fields in our struct LowCore which apparently have
been copied from a very old version of the Linux kernel. These
fields are not architected in the "Principles of Operation", and
only used on these memory locations in Linux kernels older than
2.6.29. Newer Linux kernels moved the entries to different locations
or are not using them at all anymore. Thus we should never access
these fields from the QEMU side, so they should be removed.
While we're at it, also add a QEMU_BUILD_BUG_ON() statement to
assert that struct LowCore has the right size.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1551775581-27989-1-git-send-email-thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can eliminate an extra TB in this case, which merely
loads a "return address" into rn.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For priv levels 1 & 2, we were doing so from do_ibranch_priv.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add the kvm_arm_get_max_vm_ipa_size() helper that returns the
number of bits in the IPA address space supported by KVM.
This capability needs to be known to create the VM with a
specific IPA max size (kvm_type passed along with the KVM_CREATE_VM
ioctl).
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-6-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will allow sharing code that adjusts rmode beyond
the existing users.
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-10-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This decoding more closely matches the ARMv8.4 Table C4-6,
Encoding table for Data Processing - Register Group.
In particular, op2 == 0 is now more than just Add/sub (with carry).
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-7-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We do not need an out-of-line helper for manipulating bits in pstate.
While changing things, share the implementation of gen_ss_advance.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The EL0+UMA check is unique to DAIF. While SPSel had avoided the
check by nature of already checking EL >= 1, the other post v8.0
extensions to MSR (imm) allow EL0 and do not require UMA. Avoid
the unconditional write to pc and use raise_exception_ra to unwind.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Minimize the number of places that will need updating when
the virtual host extensions are added.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190301200501.16533-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Found by inspection: Rn is the base register against which the
load began; I is the register within the mask being processed.
The exception return should of course be processed from the loaded PC.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190301202921.21209-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The floating-point extension facility implemented certain changes to
BFP, HFP and DFP instructions.
As we don't implement HFP/DFP, we can ignore those completely. Related
to BFP, the changes include
- SET BFP ROUNDING MODE (SRNMB) instruction
- BFP-rounding-mode field in the FPC register is changed to 3 bits
- CONVERT FROM LOGICAL instructions
- CONVERT TO LOGICAL instructions
- Changes (rounding mode + XxC) added to
-- CONVERT TO FIXED
-- CONVERT FROM FIXED
-- LOAD FP INTEGER
-- LOAD ROUNDED
-- DIVIDE TO INTEGER
For TCG, we don't implement DIVIDE TO INTEGER, and it is harder to
implement, so skip that. Also, as we don't implement PFPO, we can skip
changes to that as well. The other parts are now implemented, we can
indicate the facility.
z14 PoP mentions that "The floating-point extension facility is installed
in the z/Architecture architectural mode. When bit 37 is one, bit 42 is
also one.", meaning that the DFP (decimal-floating-point) facility also
has to be indicated. We can ignore that for now.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-16-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
"round to nearest with ties away from 0" maps to float_round_ties_away.
"round to prepare for shorter precision" maps to float_round_to_odd.
As all instructions properly check for valid rounding modes in translate.c
we can add an assert. Fix one missing empty line.
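A sketch of the two new cases (hypothetical function name; the other,
pre-existing modes are omitted):

    static int bfp_rounding_mode(int m3)
    {
        switch (m3) {
        case 1: /* round to nearest with ties away from 0 */
            return float_round_ties_away;
        case 3: /* round to prepare for shorter precision */
            return float_round_to_odd;
        default: /* the pre-existing encodings are elided in this sketch */
            g_assert_not_reached();
        }
    }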
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
With the floating-point extension facility, LOAD ROUNDED has
a rounding mode specification and the inexact-exception control (XxC).
Handle them just like e.g. LOAD FP INTEGER.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
With the floating-point extension facility
- CONVERT FROM LOGICAL
- CONVERT TO LOGICAL
- CONVERT TO FIXED
- CONVERT FROM FIXED
- LOAD FP INTEGER
have both a rounding mode specification and the inexact-exception control
(XxC). Other instructions will be handled separately.
Check for valid rounding modes and also forward the XxC (via m4). To
avoid a lot of boilerplate code and changes to the helpers, combine
the m3 and m4 fields in a single 32-bit TCG variable. Perform checks at
a central place, taking into account whether the m3 or m4 field was
ignored before the floating-point extension facility was introduced.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-13-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Some instructions allow suppressing IEEE inexact exceptions.
z14 PoP, 9-23, "Suppression of Certain IEEE Exceptions"
IEEE-inexact-exception control (XxC): Bit 1 of
the M4 field is the XxC bit. If XxC is zero, recognition
of IEEE-inexact exception is not suppressed; if XxC
is one, recognition of IEEE-inexact exception is
suppressed.
In particular, handling for overflow/underflow remains as is; inexact
is reported along with it:
z14 PoP, 9-23, "Suppression of Certain IEEE Exceptions"
For example, the IEEE-inexact-exception control (XxC)
has no effect on the DXC; that is, the DXC for IEEE-overflow
or IEEE-underflow exceptions along with the
detail for exact, inexact and truncated, or inexact and
incremented, is reported according to the actual
condition.
Follow-up patches will wire it up correctly for the applicable
instructions.
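In softfloat terms this boils down to masking the inexact flag before
deciding whether to inject an exception, roughly (helper shape
assumed; float_flag_inexact is QEMU's softfloat flag):

    /* sketch: with XxC, drop inexact from the set of flags that can
       trigger an interruption; the overflow/underflow DXC detail is
       derived before this masking, as the quoted text requires */
    static uint8_t apply_xxc(uint8_t flags, bool xxc)
    {
        return xxc ? (flags & ~float_flag_inexact) : flags;
    }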
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-12-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We want to reuse this in the context of vector instructions. So use
better matching names and introduce s390_restore_bfp_rounding_mode().
While at it, add proper newlines.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-11-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's split handling of BFP/DFP rounding mode configuration. Also,
let's not reuse the sfpc handler, use a separate handler so we can
properly check for specification exceptions for SRNMB.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-10-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We already forward the 3 bits correctly in the translation functions. We
also have to handle them properly and check for specification
exceptions.
Setting an invalid rounding mode (BFP only, all DFP rounding modes)
results in a specification exception. Setting unassigned bits in the
fpc results in a specification exception.
This fixes LOAD FPC (AND SIGNAL) and SET FPC (AND SIGNAL). Also, for
SET BFP ROUNDING MODE, the 3-bit rounding mode is now explicitly checked.
Note: TCG_CALL_NO_WG is required for the sfpc handler, as we now inject
exceptions.
We won't be modeling absence of the "floating-point extension facility"
for now; it is not necessary, as most software takes the facility for
granted without checking.
z14 PoP, 9-23, "LOAD FPC"
When the floating-point extension facility is
installed, bits 29-31 of the second operand must
specify a valid BFP rounding mode and bits 6-7,
14-15, 24, and 28 must be zero; otherwise, a
specification exception is recognized.
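A sketch of the corresponding check, with the constant derived from
the quoted bit positions (the PoP numbers bits from the MSB of the
32-bit fpc, so bits 6-7, 14-15, 24 and 28 become the mask 0x03030088;
the helper name is hypothetical):

    static bool fpc_is_valid(uint32_t fpc)
    {
        int rnd = fpc & 0x7;  /* bits 29-31: BFP rounding mode */

        /* valid BFP rounding modes are 0-3 and 7 */
        return !(fpc & 0x03030088u) && (rnd <= 3 || rnd == 7);
    }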
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-9-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The trap is triggered based on the priority of the enabled signaling flags.
Only overflow and underflow allow a concurrent inexact exception.
z14 PoP, 9-33, Figure 9-21
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can directly work on the uint64_t value, no need for a temporary
uint32_t value.
Also cleanup and shorten the comments.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
IEEE underflows are not reported when the mask bit is off and we don't
also have an inexact exception.
z14 PoP, 9-20, "IEEE Underflow":
An IEEE-underflow exception is recognized for an
IEEE target when the tininess condition exists and
either: (1) the IEEE-underflow mask bit in the FPC
register is zero and the result value is inexact, or (2)
the IEEE-underflow mask bit in the FPC register is
one.
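That condition transcribes directly (names hypothetical):

    /* underflow is recognized iff the result is tiny AND either the
       FPC underflow mask bit is one or the result is also inexact */
    static bool ieee_underflow_recognized(bool tiny, bool mask_bit,
                                          bool inexact)
    {
        return tiny && (mask_bit || inexact);
    }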
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Many things are wrong and some parts cannot be fixed yet. Fix what we
can fix easily and add two FIXMEs:
The fpc flags are not updated in case an exception is actually injected.
Inexact exceptions have to be handled separately, as they are the only
exceptions that can coexist with underflows and overflows.
I reread the horribly complicated chapters in the PoP at least 5 times
and hope I got it right.
For references:
- z14 PoP, 9-18, "IEEE Exceptions"
- z14 PoP, 19-9, Figure 19-8
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We want to reuse that function in vector instruction context. While at it,
cleanup the code, using defines for magic values and avoiding the
handcrafted bit conversion.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190218122710.23639-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>