This removes all of the problems with unaligned accesses
to the bytecode stream.

With an 8-bit opcode at the bottom, we have 24 bits remaining,
which are generally split into 6 4-bit slots.  This fits well
with the maximum length opcodes, e.g. INDEX_op_add2_i32, which
have 6 register operands.

We have, in previous patches, rearranged things such that there
are no operations with a label which have more than one other
operand.  This leaves us with a 20-bit field in which to encode
a label, giving us a maximum TB size of 512k -- easily large enough.

Change the INDEX_op_tci_movi_{i32,i64} opcodes to tci_mov[il].
The former puts the immediate in the upper 20 bits of the insn,
like we do for the label displacement.  The latter uses a label
to reference an entry in the constant pool.  Thus, in the worst
case we still have a single memory reference for any constant,
but now the constants are out-of-line of the bytecode and can
be shared between different moves, saving space.

Change INDEX_op_call to use a label to reference a pair of
pointers in the constant pool.  This removes the only slightly
dodgy link with the layout of struct TCGHelperInfo.

The re-encode cannot be done in pieces.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

TCG Interpreter (TCI) - Copyright (c) 2011 Stefan Weil.

This file is released under the BSD license.

1) Introduction

TCG (Tiny Code Generator) is a code generator which translates
code fragments ("basic blocks") from target code (any of the
targets supported by QEMU) to a code representation which
can be run on a host.

QEMU can create native code for some hosts (arm, i386, ia64, ppc, ppc64,
s390, sparc, x86_64). For others, unofficial host support was written.

By adding a code generator for a virtual machine and using an
interpreter for the generated bytecode, it is possible to
support (almost) any host.

This is what TCI (Tiny Code Interpreter) does.

2) Implementation

Like each TCG host backend, TCI implements the code generator in
tcg-target.c.inc and tcg-target.h. Both files are in directory tcg/tci.

The additional file tcg/tci.c adds the interpreter and disassembler.

The bytecode consists of opcodes (with only a few exceptions, with
the same numeric values and semantics as used by TCG), and up
to six arguments packed into a 32-bit integer. See comments in tci.c
for details on the encoding.

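As a rough illustration of packing an opcode and up to six arguments
into one 32-bit word, the sketch below assumes an 8-bit opcode in the
low bits followed by six 4-bit register slots, plus a variant with one
register and a signed 20-bit field in the upper bits. The helper names
(pack_rrrrrr, unpack_reg, pack_ri20, unpack_imm20) are hypothetical and
are not the functions used in tci.c, which remains the reference for
the real encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not taken from tci.c. */

/* Pack an 8-bit opcode (bits 0..7) and six 4-bit register numbers. */
static uint32_t pack_rrrrrr(uint8_t op, const uint8_t regs[6])
{
    uint32_t insn = op;

    for (int i = 0; i < 6; i++) {
        /* Slot i is assumed to occupy bits 8 + 4*i .. 11 + 4*i. */
        insn |= (uint32_t)(regs[i] & 0xf) << (8 + 4 * i);
    }
    return insn;
}

/* Extract register slot 0..5 from an instruction word. */
static unsigned unpack_reg(uint32_t insn, int slot)
{
    assert(slot >= 0 && slot < 6);
    return (insn >> (8 + 4 * slot)) & 0xf;
}

/* Pack an opcode, one register and a signed 20-bit value (bits 12..31). */
static uint32_t pack_ri20(uint8_t op, uint8_t reg, int32_t imm)
{
    assert(imm >= -0x80000 && imm <= 0x7ffff);
    return op | ((uint32_t)(reg & 0xf) << 8) | ((uint32_t)imm << 12);
}

/* Recover the signed 20-bit value from the upper bits of the word. */
static int32_t unpack_imm20(uint32_t insn)
{
    uint32_t field = insn >> 12;    /* zero-extended 20-bit field */

    /* Flip the sign bit and subtract it back to sign-extend portably. */
    return (int32_t)(field ^ 0x80000u) - 0x80000;
}
```

Because every instruction is a whole 32-bit word, a decoder of this
kind reads aligned words only. A signed 20-bit field covers the range
-524288 to 524287.
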
3) Usage

For hosts without native TCG, the interpreter TCI must be enabled by

        configure --enable-tcg-interpreter

If configure is called without --enable-tcg-interpreter, it will
suggest using this option. Setting it automatically would need
additional code in configure which must be fixed when new native TCG
implementations are added.

For hosts with native TCG, the interpreter TCI can be enabled by

        configure --enable-tcg-interpreter

The only difference between running QEMU with TCI and running it
without TCI should be speed. Especially during development of TCI, it
was very useful to compare runs with and without TCI. Create
/tmp/qemu.log by

        qemu-system-i386 -d in_asm,op_opt,cpu -D /tmp/qemu.log -singlestep

once with interpreter and once without interpreter and compare the resulting
qemu.log files. This is also useful to see the effects of additional
registers or additional opcodes (it is easy to modify the virtual machine).
It can also be used to verify native TCGs.

Hosts with native TCG can also enable TCI by claiming to be unsupported:

        configure --cpu=unknown --enable-tcg-interpreter

configure then no longer uses the native linker script (*.ld) for
user mode emulation.

4) Status

TCI needs special implementations for 32 and 64 bit hosts, 32 and 64 bit
targets, and hosts and targets with the same or different endianness.

            | host (le)                     host (be)
            | 32             64             32             64
------------+------------------------------------------------------------
target (le) | s0, u0         s1, u1         s?, u?         s?, u?
32 bit      |
            |
target (le) | sc, uc         s1, u1         s?, u?         s?, u?
64 bit      |
            |
target (be) | sc, u0         sc, uc         s?, u?         s?, u?
32 bit      |
            |
target (be) | sc, uc         sc, uc         s?, u?         s?, u?
64 bit      |
            |

System emulation
s? = untested
sc = compiles
s0 = bios works
s1 = grub works
s2 = Linux boots

Linux user mode emulation
u? = untested
uc = compiles
u0 = static hello works
u1 = linux-user-test works

5) Todo list

* TCI is not widely tested. It was written and tested on an x86_64 host
  running i386 and x86_64 system emulation and Linux user mode.
  A cross compiled QEMU for i386 host also works with the same basic tests.
  A cross compiled QEMU for mipsel host works, too. It is terribly slow
  because I run it in a mips malta emulation, so it is an interpreted
  emulation in an emulation.
  A cross compiled QEMU for arm host works (tested with pc bios).
  A cross compiled QEMU for ppc host works at least partially:
  i386-linux-user/qemu-i386 can run a simple hello-world program
  (tested in a ppc emulation).

* Some TCG opcodes are missing in the code generator, in the
  interpreter, or in both. These opcodes raise a runtime exception, so
  it is possible to see where code must be added.

* It might be useful to have a runtime option which selects the native TCG
  or TCI, so QEMU would have to include two TCGs. Today, selecting TCI
  is a configure option, so you need two compilations of QEMU.

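The second item above (missing opcodes showing up at run time) can be
pictured with a toy dispatch loop. This is only a sketch under stated
assumptions: the three TOY_OP_* opcodes, the toy_run function and the
instruction layout (8-bit opcode, 4-bit register slots, 20-bit
immediate) are made up for illustration, and the sketch reports the
unknown opcode and returns -1 where a real interpreter might abort
instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical toy opcodes; these are not TCG's INDEX_op_* values. */
enum { TOY_OP_MOVI = 1, TOY_OP_ADD = 2, TOY_OP_EXIT = 3 };

/*
 * Run a toy bytecode program of n 32-bit words.
 * Returns 0 on normal completion, or -1 after reporting the first
 * opcode that this interpreter does not implement.
 */
static int toy_run(const uint32_t *code, size_t n, uint32_t regs[16])
{
    assert(code != NULL && regs != NULL);

    for (size_t pc = 0; pc < n; pc++) {
        uint32_t insn = code[pc];
        unsigned op = insn & 0xff;          /* bits 0..7 */
        unsigned r0 = (insn >> 8) & 0xf;    /* bits 8..11 */
        unsigned r1 = (insn >> 12) & 0xf;   /* bits 12..15 */
        unsigned r2 = (insn >> 16) & 0xf;   /* bits 16..19 */

        switch (op) {
        case TOY_OP_MOVI:
            /* Sign-extend the 20-bit immediate held in bits 12..31. */
            regs[r0] = ((insn >> 12) ^ 0x80000u) - 0x80000u;
            break;
        case TOY_OP_ADD:
            regs[r0] = regs[r1] + regs[r2];
            break;
        case TOY_OP_EXIT:
            return 0;
        default:
            /* A missing opcode becomes visible here, with its position. */
            fprintf(stderr, "toy_run: unimplemented opcode %u at word %zu\n",
                    op, pc);
            return -1;
        }
    }
    return 0;
}
```

Failing loudly in the default case, rather than skipping the word, is
what makes it possible to see which opcode still needs code.
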