X-Git-Url: https://repo.jachan.dev/qemu.git/blobdiff_plain/6b2f90fbbd31d594238098f46ef63ee307a12f55..36017dc68aa8c345d10ad7ba7bc3dba580f3f035:/tcg/README

diff --git a/tcg/README b/tcg/README
index aa86992bca..063aeb95ea 100644
--- a/tcg/README
+++ b/tcg/README
@@ -14,6 +14,10 @@ the emulated architecture. As TCG started as a generic C backend used
 for cross compiling, it is assumed that the TCG target is different
 from the host, although it is never the case for QEMU.
 
+In this document, we use "guest" to specify what architecture we are
+emulating; "target" always means the TCG target, the machine on which
+we are running QEMU.
+
 A TCG "function" corresponds to a QEMU Translated Block (TB).
 
 A TCG "temporary" is a variable only live in a basic
@@ -77,11 +81,20 @@ destroyed, but local temporaries and globals are preserved.
 Using the tcg_gen_helper_x_y it is possible to call any function
 taking i32, i64 or pointer types. By default, before calling a helper,
 all globals are stored at their canonical location and it is assumed
-that the function can modify them. This can be overridden by the
-TCG_CALL_CONST function modifier. By default, the helper is allowed to
-modify the CPU state or raise an exception. This can be overridden by
-the TCG_CALL_PURE function modifier, in which case the call to the
-function is removed if the return value is not used.
+that the function can modify them. By default, the helper is allowed to
+modify the CPU state or raise an exception.
+
+This can be overridden using the following function modifiers:
+- TCG_CALL_NO_READ_GLOBALS means that the helper does not read globals,
+  either directly or via an exception. They will not be saved to their
+  canonical locations before calling the helper.
+- TCG_CALL_NO_WRITE_GLOBALS means that the helper does not modify any globals.
+  They will only be saved to their canonical location before calling helpers,
+  but they won't be reloaded afterwise.
+- TCG_CALL_NO_SIDE_EFFECTS means that the call to the function is removed if
+  the return value is not used.
+
+Note that TCG_CALL_NO_READ_GLOBALS implies TCG_CALL_NO_WRITE_GLOBALS.
 
 On some TCG targets (e.g. x86), several calling conventions are
 supported.
@@ -349,7 +362,28 @@ st32_i64 t0, t1, offset
 write(t0, t1 + offset)
 Write 8, 16, 32 or 64 bits to host memory.
 
-********* 64-bit target on 32-bit host support
+All this opcodes assume that the pointed host memory doesn't correspond
+to a global. In the latter case the behaviour is unpredictable.
+
+********* Multiword arithmetic support
+
+* add2_i32/i64 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
+* sub2_i32/i64 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
+
+Similar to add/sub, except that the double-word inputs T1 and T2 are
+formed from two single-word arguments, and the double-word output T0
+is returned in two single-word outputs.
+
+* mulu2_i32/i64 t0_low, t0_high, t1, t2
+
+Similar to mul, except two unsigned inputs T1 and T2 yielding the full
+double-word product T0. The later is returned in two single-word outputs.
+
+* muls2_i32/i64 t0_low, t0_high, t1, t2
+
+Similar to mulu2, except the two inputs T1 and T2 are signed.
+
+********* 64-bit guest on 32-bit host support
 
 The following opcodes are internal to TCG. Thus they are to be implemented by
 32-bit host code generators, but are not to be emitted by guest translators.
@@ -360,18 +394,6 @@ They are emitted as needed by inline functions within "tcg-op.h".
 Similar to brcond, except that the 64-bit values T0 and T1
 are formed from two 32-bit arguments.
 
-* add2_i32 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
-* sub2_i32 t0_low, t0_high, t1_low, t1_high, t2_low, t2_high
-
-Similar to add/sub, except that the 64-bit inputs T1 and T2 are
-formed from two 32-bit arguments, and the 64-bit output T0
-is returned in two 32-bit outputs.
-
-* mulu2_i32 t0_low, t0_high, t1, t2
-
-Similar to mul, except two 32-bit (unsigned) inputs T1 and T2 yielding
-the full 64-bit product T0. The later is returned in two 32-bit outputs.
-
 * setcond2_i32 dest, t1_low, t1_high, t2_low, t2_high, cond
 
 Similar to setcond, except that the 64-bit values T1 and T2 are
@@ -503,9 +525,9 @@ register.
   a better generated code, but it reduces the memory usage of TCG and
   the speed of the translation.
 
-- Don't hesitate to use helpers for complicated or seldom used target
+- Don't hesitate to use helpers for complicated or seldom used guest
   instructions. There is little performance advantage in using TCG to
-  implement target instructions taking more than about twenty TCG
+  implement guest instructions taking more than about twenty TCG
   instructions. Note that this rule of thumb is more applicable to
   helpers doing complex logic or arithmetic, where the C compiler has
   scope to do a good job of optimisation; it is less relevant where
@@ -513,9 +535,9 @@ register.
   inline TCG may still be faster for longer sequences.
 
 - The hard limit on the number of TCG instructions you can generate
-  per target instruction is set by MAX_OP_PER_INSTR in exec-all.h --
+  per guest instruction is set by MAX_OP_PER_INSTR in exec-all.h --
   you cannot exceed this without risking a buffer overrun.
 
 - Use the 'discard' instruction if you know that TCG won't be able to
   prove that a given global is "dead" at a given program point. The
-  x86 target uses it to improve the condition codes optimisation.
+  x86 guest uses it to improve the condition codes optimisation.
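
Note on the multiword arithmetic opcodes: the patch describes add2, sub2,
mulu2 and muls2 only in prose. The sketch below models the 32-bit variants
in plain C, to make the low/high word handling concrete. It is not TCG
code: the ref_* function names are hypothetical, and it assumes the
semantics are exactly the double-word add, subtract and full-width
multiply that the README text describes.

```c
#include <stdint.h>

/* Hypothetical reference models for illustration; not part of the TCG API. */

/* add2_i32: (hi:lo) = (ah:al) + (bh:bl), wrapping modulo 2^64. */
static void ref_add2_i32(uint32_t *lo, uint32_t *hi,
                         uint32_t al, uint32_t ah,
                         uint32_t bl, uint32_t bh)
{
    uint32_t l = al + bl;
    uint32_t carry = l < al;      /* low word wrapped around */

    *lo = l;
    *hi = ah + bh + carry;
}

/* sub2_i32: (hi:lo) = (ah:al) - (bh:bl), wrapping modulo 2^64. */
static void ref_sub2_i32(uint32_t *lo, uint32_t *hi,
                         uint32_t al, uint32_t ah,
                         uint32_t bl, uint32_t bh)
{
    uint32_t borrow = al < bl;    /* low word needs a borrow */

    *lo = al - bl;
    *hi = ah - bh - borrow;
}

/* mulu2_i32: full 64-bit product of two unsigned 32-bit inputs. */
static void ref_mulu2_i32(uint32_t *lo, uint32_t *hi,
                          uint32_t a, uint32_t b)
{
    uint64_t p = (uint64_t)a * b;

    *lo = (uint32_t)p;
    *hi = (uint32_t)(p >> 32);
}

/* muls2_i32: full 64-bit product of two signed 32-bit inputs. */
static void ref_muls2_i32(uint32_t *lo, uint32_t *hi,
                          int32_t a, int32_t b)
{
    uint64_t p = (uint64_t)((int64_t)a * b);

    *lo = (uint32_t)p;
    *hi = (uint32_t)(p >> 32);
}
```

The low and high halves are returned through separate pointers to mirror
the t0_low/t0_high operand pairs in the opcode signatures. The i64 variants
would follow the same pattern with 64-bit words and a 128-bit result.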