Provide what will become a larger context for splitting
the very large tcg_optimize function.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prepare for tracking different masks by renaming this one.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Having observed e.g. al8+leq in dumps, canonicalize to al+leq.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These functions have been replaced by cpu_*_mmu as the
most proper interface to use from target code.
Hide these declarations from code that should not use them.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The unsigned saturations are handled via generic code
using min/max. The signed saturations are expanded using
double-sized arithmetic and a saturating pack.
Since all operations are done via expansion, do not
actually set TCG_TARGET_HAS_sat_vec.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These logical and arithmetic operations are optional but trivial.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Implementing add, sub, and, or, xor as the minimal set.
This allows us to actually enable vectors in query_s390_facilities.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add registers and function stubs. The functionality
is disabled via squashing s390_facilities[2] to 0.
We must still include results for the mandatory opcodes in
tcg_target_op_def, as all opcodes are checked during tcg init.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
They are rightly values in the same enumeration.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will shortly need to be able to check facilities beyond the
first 64. Instead of explicitly masking against s390_facilities,
create a HAVE_FACILITY macro that indexes an array.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Change name to HAVE_FACILITY (david)
This emphasizes that we don't support s390, only 64-bit s390x hosts.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For usadd, we only have to consider overflow. Since ~B + B == -1,
the maximum value for A that saturates is ~B.
For ussub, we only have to consider underflow. The minimum value
that saturates to 0 from A - B is B.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is no point in encoding load/store within a bit of
the memory trace info operand. Represent atomic operations
as a single read-modify-write tracepoint. Use MemOpIdx
instead of inventing a form specifically for traces.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the MemOpIdx directly, rather than the rearrangement
of the same bits currently done by the trace infrastructure.
Pass in enum qemu_plugin_mem_rw so that we are able to treat
read-modify-write operations as a single operation.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We (will) often have the complete MemOpIdx handy, so use that.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We're about to move this out of tcg.h, so rename it
as we did when moving MemOp.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have lacked expressive support for memory sizes larger
than 64-bits for a while. Fixing that requires adjustment
to several points where we used this for array indexing,
and two places that develop -Wswitch warnings after the change.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This version of tcg_out_mov is emits a nop to fill the
delay slot if the move is not required.
The only current use, for INDEX_op_goto_ptr, will always
require the move but properly documents the delay slot.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let the compiler decide about inlining.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Only use indirect jumps. Finish weaning away from the
unique alignment requirements for code_gen_buffer.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Weaning off of unique alignment requirements, so allow JAL
to not reach the target. TCG_TMP1 is always available for
use as a scratch because it is clobbered by the subroutine
being called.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let the compiler decide about inlining.
Remove tcg_out_ext8s and tcg_out_ext16s as unused.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let the compiler decide about inlining.
Remove tcg_out_nop as unused.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already computed the rotated value of the imm8
portion of the complete imm12 encoding. No sense leaving
the combination of rot + rotation to the caller.
Create an encode_imm12_nofail helper that performs an assert.
This removes the final use of the local "rotl" function,
which duplicated our generic "rol32" function.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expand these hard-coded instructions symbolically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
ARMv4T has BX as its only interworking instruction. In order
to support testing of different architecture revisions with a
qemu binary that may have been built for, say ARMv6T2, fill in
the blank required to make calls to helpers in thumb mode.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
According to the Arm ARM DDI 0406C, section A1.3, the valid variants
are ARMv5T, ARMv5TE, ARMv5TEJ -- there is no ARMv5 without Thumb.
Therefore simplify the test from preprocessor ifdefs to base
architecture revision. Retain the "t" in the name to minimize churn.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Some of the functions specified _reg, some _imm, and some
left it blank. Make it clearer to which we are referring.
Split tcg_out_b_reg from tcg_out_bx_reg, to indicate when
we do not actually require BX semantics.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
GCC since 4.8 provides the definition and we now require 7.5.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit 5e8892db93 fixed several function signatures but tcg_out_vec_op
for arm is missing. It causes a build error on armv6 and armv7:
tcg-target.c.inc:2718:42: error: argument 5 of type 'const TCGArg *'
{aka 'const unsigned int *'} declared as a pointer [-Werror=array-parameter=]
const TCGArg *args, const int *const_args)
~~~~~~~~~~~~~~^~~~
../tcg/tcg.c:120:41: note: previously declared as an array 'const TCGArg[16]'
{aka 'const unsigned int[16]'}
const TCGArg args[TCG_MAX_OP_ARGS],
~~~~~~~~~~~~~~^~~~
Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210908185338.7927-1-jziviani@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Clang only sets _CALL_ELF for ppc64, and nothing at all to specify
the ABI for ppc32. Make a good guess based on other symbols.
Reported-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If __APPLE__, ensure that _CALL_DARWIN is set, then remove
our local TCG_TARGET_CALL_DARWIN.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We need to be able to represent VEX.W on a 32-bit host, where REX.W
will always be zero. Fixes the encoding for VPSLLVQ and VPSRLVQ.
Fixes: a2ce146a06 ("tcg/i386: Support vector variable shift opcodes")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/385
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the -d nochain check to bits on tb->cflags.
These will be used for more than -d nochain shortly.
Set bits during curr_cflags, test them in translator_use_goto_tb,
assert we're not doing anything odd in tcg_gen_goto_tb. The test
in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210717221851.2124573-4-richard.henderson@linaro.org>
Reduce the amount of code duplication by always passing
the TCGMemOpIdx argument to helper_atomic_*. This is not
currently used for user-only, but it's easy to ignore.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since 6eea04347e, all tcg backends support goto_ptr.
Remove the conditional, making support mandatory.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In tcg_region_prologue_set, we reset TCGContext.code_gen_ptr.
So do that after we've used it to dump the prologue contents.
Fixes: b0a0794a0f
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can call do_tb_phys_invalidate from an iocontext, which has
no per-thread tcg_ctx. Move this to tb_ctx, which is global.
The actual update still takes place with a lock held, so only
an atomic set is required, not an atomic increment.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/457
Tested-by: Viktor Ashirov <vashirov@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The function is called only at tcg_gen_code() when duplicated TBs
are translated by different threads, and when the tcg_region_tree
is reset. Bake it into the underlying GTree as its value destroy
function to unite these situations.
Also remove tcg_region_tree_traverse() which now becomes useless.
Signed-off-by: Liren Wei <lrwei@bupt.edu.cn>
Message-Id: <8dc352f08d038c4e7a1f5f56962398cdc700c3aa.1625404483.git.lrwei@bupt.edu.cn>
[rth: Name the new tb_tc_cmp parameter correctly.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We lost the ',' following the called function name.
Fixes: 3e92aa3443
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
TCG_TARGET_HAS_MEMORY_BSWAP is already unset for this backend,
which means that MO_BSWAP be handled by the middle-end and
will never be seen by the backend. Thus the indexes used with
qemu_{ld,st}_helpers will always be zero.
Tidy the comments and asserts in tcg_out_qemu_{ld,st}_direct.
It is not that we do not handle bswap "yet", but never will.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The memory bswap support in the aarch64 backend merely dates from
a time when it was required. There is nothing special about the
backend support that could not have been provided by the middle-end
even prior to the introduction of the bswap flags.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that the middle-end can replicate the same tricks as tcg/arm
used for optimizing bswap for signed loads and for stores, do not
pretend to have these memory ops in the backend.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
By removing TCG_BSWAP_IZ we indicate that the input is
not zero-extended, and thus can remove an explicit extend.
By removing TCG_BSWAP_OZ, we allow the implementation to
leave high bits set, which will be ignored by the store.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can perform any required sign-extension via TCG_BSWAP_OS.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Implement the new semantics in the fallback expansion.
Change all callers to supply the flags that keep the
semantics unchanged locally.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Notice when the input is known to be zero-extended and force
the TCG_BSWAP_IZ flag on. Honor the TCG_BSWAP_OS bit during
constant folding. Propagate the input to the output mask.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The existing interpreter zero-extends, ignoring high bits.
Simply add a separate sign-extension opcode if required.
Ensure that the interpreter supports ext16s when bswap16 is enabled.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_bswap32 and tcg_out_bswap32s.
Use the flags in the internal uses for loads and stores.
For mips32r2 bswap32 with zero-extension, standardize on
WSBH+ROTR+DEXT. This is the same number of insns as the
previous DSBH+DSHD+DSRL but fits in better with the flags check.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_bswap16 and tcg_out_bswap16s. Use the flags
in the internal uses for loads and stores.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For INDEX_op_bswap16_i64, use 64-bit instructions so that we can
easily provide the extension to 64-bits. Drop the special case,
previously used, where the input is already zero-extended -- the
minor code size savings is not worth the complication.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For INDEX_op_bswap32_i32, pass 0 for flags: input not zero-extended,
output does not need extension within the host 64-bit register.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With the use of a suitable temporary, we can use the same
algorithm when src overlaps dst. The result is the same
number of instructions either way.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will shortly require sari in other context;
split out both for cleanliness sake.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will shortly require these in other context;
make the expansion as clear as possible.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Combine the three bswap16 routines, and differentiate via the flags.
Use the correct flags combination from the load/store routines, and
pass along the constant parameter from tcg_out_op.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass in the input and output size. We currently use 3 of the 5
possible combinations; the others may be used by new tcg opcodes.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Retain the current rorw bswap16 expansion for the zero-in/zero-out case.
Otherwise, perform a wider bswap plus a right-shift or extend.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will eventually simplify front-end usage, and will allow
backends to unset TCG_TARGET_HAS_MEMORY_BSWAP without loss of
optimization.
The argument is added during expansion, not currently exposed to the
front end translators. The backends currently only support a flags
value of either TCG_BSWAP_IZ, or (TCG_BSWAP_IZ | TCG_BSWAP_OZ),
since they all require zero top bytes and leave them that way.
At the existing call sites we pass in (TCG_BSWAP_IZ | TCG_BSWAP_OZ),
except for the flags-ignored cases of a 32-bit swap of a 32-bit
value and or a 64-bit swap of a 64-bit value, where we pass 0.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The Arm MVE VDUP implementation would like to be able to emit code to
duplicate a byte or halfword value into an i32. We have code to do
this already in tcg-op-gvec.c, so all we need to do is make the
functions global.
For consistency with other functions made available to the frontends:
* we rename to tcg_gen_dup_*
* we expose both the _i32 and _i64 forms
* we provide the #define for a _tl form
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210617121628.20116-10-peter.maydell@linaro.org
Assume that we'll have fewer temps allocated after
restarting with a fewer number of instructions.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This function should have been updated for vector types
when they were introduced.
Fixes: d2fd745fe8
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/367
Cc: qemu-stable@nongnu.org
Tested-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We should not be aligning the offset in temp_allocate_frame,
because the odd offset produces an aligned address in the end.
Instead, pass the logical offset into tcg_set_frame and add
the stack bias last.
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wrap guest memory operations for tci like we do for cpu_ld*_data.
We cannot actually use the cpu_ldst.h interface without duplicating
the memory trace operations performed within, which will already
have been expanded into the tcg opcode stream.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These macros are only used in one place. By expanding,
we get to apply some common-subexpression elimination
and create some local variables.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This reverts commit dc09f047ed.
For tcg, tracepoints are expanded inline in tcg opcodes.
Using a helper which generates a second tracepoint is incorrect.
For system mode, the extraction and re-packing of MemOp and mmu_idx
lost the alignment information from MemOp. So we were no longer
raising alignment exceptions for !TARGET_ALIGNED_ONLY guests.
This can be seen in tests/tcg/xtensa/test_load_store.S.
For user mode, we must update to the new signature of g2h() so that
the revert compiles. We can leave set_helper_retaddr for later.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can share this code between 32-bit and 64-bit loads and stores.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We already had the 32-bit versions for a 32-bit host; expand this
to 64-bit hosts as well. The 64-bit opcodes are new.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We already had mulu2_i32 for a 32-bit host; expand this to 64-bit
hosts as well. The muls2_i32 and the 64-bit opcodes are new.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These were already present in tcg-target.c.inc,
but not in the interpreter.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When this opcode is not available in the backend, tcg middle-end
will expand this as a series of 5 opcodes. So implementing this
saves bytecode space.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This operation is critical to staying within the interpretation
loop longer, which avoids the overhead of setup and teardown for
many TBs.
The check in tcg_prologue_init is disabled because TCI does
want to use NULL to indicate exit, as opposed to branching to
a real epilogue.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This removes all of the problems with unaligned accesses
to the bytecode stream.
With an 8-bit opcode at the bottom, we have 24 bits remaining,
which are generally split into 6 4-bit slots. This fits well
with the maximum length opcodes, e.g. INDEX_op_add2_i32, which
have 6 register operands.
We have, in previous patches, rearranged things such that there
are no operations with a label which have more than one other
operand. Which leaves us with a 20-bit field in which to encode
a label, giving us a maximum TB size of 512k -- easily large.
Change the INDEX_op_tci_movi_{i32,i64} opcodes to tci_mov[il].
The former puts the immediate in the upper 20 bits of the insn,
like we do for the label displacement. The later uses a label
to reference an entry in the constant pool. Thus, in the worst
case we still have a single memory reference for any constant,
but now the constants are out-of-line of the bytecode and can
be shared between different moves saving space.
Change INDEX_op_call to use a label to reference a pair of
pointers in the constant pool. This removes the only slightly
dodgy link with the layout of struct TCGHelperInfo.
The re-encode cannot be done in pieces.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Inline it into its one caller, tci_write_reg64.
Drop the asserts that are redundant with tcg_read_r.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>