for cross compiling, it is assumed that the TCG target is different
from the host, although it is never the case for QEMU.
+In this document, "guest" denotes the architecture being emulated;
+"target" always means the TCG target, that is, the machine on which
+QEMU is running.
+
A TCG "function" corresponds to a QEMU Translated Block (TB).
A TCG "temporary" is a variable only live in a basic
All this opcodes assume that the pointed host memory doesn't correspond
to a global. In the latter case the behaviour is unpredictable.
-********* 64-bit target on 32-bit host support
+********* Multiword arithmetic support
+
+* add2_i32/i64 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
+* sub2_i32/i64 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
+
+Similar to add/sub, except that the double-word inputs T1 and T2 are
+formed from two single-word arguments, and the double-word output T0
+is returned in two single-word outputs.
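The add2/sub2 description above can be modelled in plain C. This is a minimal sketch of the 32-bit variants only, not QEMU code: the names ref_add2_i32 and ref_sub2_i32 are hypothetical, and the carry/borrow handling is the usual double-word arithmetic assumed from the text.

```c
#include <stdint.h>

/* Hypothetical reference model of add2_i32 (not part of QEMU):
 * (rh:rl) = (ah:al) + (bh:bl), each value split into 32-bit halves. */
static void ref_add2_i32(uint32_t *rl, uint32_t *rh,
                         uint32_t al, uint32_t ah,
                         uint32_t bl, uint32_t bh)
{
    uint32_t lo = al + bl;       /* low word, wraps modulo 2^32 */
    uint32_t carry = lo < al;    /* 1 if the low-word add overflowed */

    *rl = lo;
    *rh = ah + bh + carry;       /* propagate the carry into the high word */
}

/* Hypothetical reference model of sub2_i32 (not part of QEMU):
 * (rh:rl) = (ah:al) - (bh:bl). */
static void ref_sub2_i32(uint32_t *rl, uint32_t *rh,
                         uint32_t al, uint32_t ah,
                         uint32_t bl, uint32_t bh)
{
    uint32_t borrow = al < bl;   /* 1 if the low-word subtract underflows */

    *rl = al - bl;
    *rh = ah - bh - borrow;      /* propagate the borrow into the high word */
}
```

The i64 variants would follow the same pattern with 64-bit words, producing a 128-bit result in two halves.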
+
+* mulu2_i32/i64 t0_low, t0_high, t1, t2
+
+Similar to mul, except that the two unsigned inputs T1 and T2 yield the
+full double-word product T0. The latter is returned in two single-word
+outputs.
+
+* muls2_i32/i64 t0_low, t0_high, t1, t2
+
+Similar to mulu2, except the two inputs T1 and T2 are signed.
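The difference between mulu2 and muls2 can be illustrated with a small C model. This is a sketch of the 32-bit variants under the semantics described above, not QEMU code: ref_mulu2_i32 and ref_muls2_i32 are hypothetical names, and the full product is computed by widening to 64 bits.

```c
#include <stdint.h>

/* Hypothetical reference model of mulu2_i32 (not part of QEMU):
 * (rh:rl) = a * b, with both inputs treated as unsigned. */
static void ref_mulu2_i32(uint32_t *rl, uint32_t *rh,
                          uint32_t a, uint32_t b)
{
    uint64_t p = (uint64_t)a * b;    /* widen first to keep all 64 bits */

    *rl = (uint32_t)p;
    *rh = (uint32_t)(p >> 32);
}

/* Hypothetical reference model of muls2_i32 (not part of QEMU):
 * (rh:rl) = a * b, with both inputs treated as signed. */
static void ref_muls2_i32(uint32_t *rl, uint32_t *rh,
                          int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;      /* signed widening multiply */
    uint64_t bits = (uint64_t)p;     /* reinterpret as a two's complement bit pattern */

    *rl = (uint32_t)bits;
    *rh = (uint32_t)(bits >> 32);
}
```

With both inputs set to the all-ones bit pattern, the unsigned model treats them as 2^32 - 1 while the signed model treats them as -1, so only the high word of the result differs.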
+
+********* 64-bit guest on 32-bit host support
The following opcodes are internal to TCG. Thus they are to be implemented by
32-bit host code generators, but are not to be emitted by guest translators.
Similar to brcond, except that the 64-bit values T0 and T1
are formed from two 32-bit arguments.
-* add2_i32 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
-* sub2_i32 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
-
-Similar to add/sub, except that the 64-bit inputs T1 and T2 are
-formed from two 32-bit arguments, and the 64-bit output T0
-is returned in two 32-bit outputs.
-
-* mulu2_i32 t0_low, t0_high, t1, t2
-
-Similar to mul, except two 32-bit (unsigned) inputs T1 and T2 yielding
-the full 64-bit product T0. The later is returned in two 32-bit outputs.
-
* setcond2_i32 dest, t1_low, t1_high, t2_low, t2_high, cond
Similar to setcond, except that the 64-bit values T1 and T2 are
a better generated code, but it reduces the memory usage of TCG and
the speed of the translation.
-- Don't hesitate to use helpers for complicated or seldom used target
+- Don't hesitate to use helpers for complicated or seldom used guest
instructions. There is little performance advantage in using TCG to
- implement target instructions taking more than about twenty TCG
+ implement guest instructions taking more than about twenty TCG
instructions. Note that this rule of thumb is more applicable to
helpers doing complex logic or arithmetic, where the C compiler has
scope to do a good job of optimisation; it is less relevant where
inline TCG may still be faster for longer sequences.
- The hard limit on the number of TCG instructions you can generate
- per target instruction is set by MAX_OP_PER_INSTR in exec-all.h --
+ per guest instruction is set by MAX_OP_PER_INSTR in exec-all.h --
you cannot exceed this without risking a buffer overrun.
- Use the 'discard' instruction if you know that TCG won't be able to
prove that a given global is "dead" at a given program point. The
- x86 target uses it to improve the condition codes optimisation.
+ x86 guest uses it to improve the condition codes optimisation.