Rebase on the latest #26

Draft. Wants to merge 33 commits into base: sync-upstream-wasmdev

Conversation

ktock
Owner

@ktock ktock commented May 19, 2025

No description provided.

@ktock ktock marked this pull request as draft May 19, 2025 14:08
@ktock ktock force-pushed the sync-upstream-wasmdev branch from 49aa5b1 to 35462d6 Compare May 20, 2025 05:03
ktock added 6 commits May 20, 2025 14:11
Wasm backend is implemented based on the TCI backend and utilizes a forked
TCI to execute TBs.

Signed-off-by: Kohei Tokunaga <[email protected]>
The Wasm backend should implement its own disassembler for Wasm
instructions.

Signed-off-by: Kohei Tokunaga <[email protected]>
Now that there is a backend for WebAssembly build (/tcg/wasm32/), the
requirement of --enable-tcg-interpreter in meson.build can be removed.

Signed-off-by: Kohei Tokunaga <[email protected]>
WebAssembly instructions vary in size, including single-byte
instructions. This commit sets TCG_TARGET_INSN_UNIT_SIZE to 1 and updates
the TCI fork to use "tcg_insn_unit_tci" (a uint32_t) for 4-byte operations.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements and, or and xor operations using Wasm
instructions. Each TCG variable is mapped to a 64-bit Wasm variable. In Wasm,
and/or/xor instructions take their operands from the Wasm stack, so the
operands are first pushed using get instructions. The result is left on the
stack and is assigned to a variable by popping it with a set instruction.
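
A minimal sketch of the get/op/set pattern described above. The opcode values
come from the Wasm binary format; emit_binop_i64 is a hypothetical name, and
the sketch assumes the variables are Wasm locals (global.get/global.set,
0x23/0x24, would have the same shape). It is not the backend's actual emitter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcodes from the Wasm core binary format. */
enum {
    WASM_LOCAL_GET = 0x20,
    WASM_LOCAL_SET = 0x21,
    WASM_I64_AND   = 0x83,
    WASM_I64_OR    = 0x84,
    WASM_I64_XOR   = 0x85,
};

/*
 * Hypothetical helper: emit "dst = a <op> b" into buf and return the
 * number of bytes written. Indices below 128 fit in one LEB128 byte.
 */
static size_t emit_binop_i64(uint8_t *buf, uint8_t op,
                             uint8_t dst, uint8_t a, uint8_t b)
{
    size_t n = 0;

    assert(dst < 128 && a < 128 && b < 128);
    buf[n++] = WASM_LOCAL_GET; buf[n++] = a;   /* push first operand */
    buf[n++] = WASM_LOCAL_GET; buf[n++] = b;   /* push second operand */
    buf[n++] = op;                             /* pop two, push result */
    buf[n++] = WASM_LOCAL_SET; buf[n++] = dst; /* pop result into dst */
    return n;
}
```

Each binary operation costs seven bytes in this form: two gets, one opcode
and one set.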

The Wasm binary format is documented at [1]. In this backend, TCI
instructions are emitted to s->code_ptr, while the corresponding Wasm
instructions are generated into a separate buffer allocated via
tcg_malloc(). These two code buffers must be merged into the final code
buffer before tcg_gen_code returns.

[1] https://webassembly.github.io/spec/core/binary/index.html

Signed-off-by: Kohei Tokunaga <[email protected]>
Add, sub and mul operations are implemented using the corresponding
instructions in Wasm.

Signed-off-by: Kohei Tokunaga <[email protected]>
ktock added 11 commits May 20, 2025 18:33
This commit implements shl, shr and sar operations using Wasm
instructions. The Wasm backend uses 64-bit variables, so the right shift
operation for 32-bit values needs to extract the lower 32 bits of the operand
before shifting. Additionally, since constant values must be encoded in
LEB128 format, this commit introduces an encoder function implemented
following [1].

[1] https://en.wikipedia.org/wiki/LEB128
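
For illustration, a sketch of unsigned and signed LEB128 encoders following
the algorithm in [1]. The function names are hypothetical and this is not
the encoder added by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v as unsigned LEB128; returns the number of bytes written. */
static size_t encode_uleb128(uint8_t *out, uint64_t v)
{
    size_t n = 0;

    assert(out != NULL);
    do {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        if (v != 0) {
            byte |= 0x80;            /* more bytes follow */
        }
        out[n++] = byte;
    } while (v != 0);
    return n;
}

/* Encode v as signed LEB128; returns the number of bytes written. */
static size_t encode_sleb128(uint8_t *out, int64_t v)
{
    size_t n = 0;
    int more = 1;

    assert(out != NULL);
    while (more) {
        uint8_t byte = v & 0x7f;
        v >>= 7;  /* assumes an arithmetic shift for negative values */
        if ((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40))) {
            more = 0;                /* sign bit of this byte is correct */
        } else {
            byte |= 0x80;
        }
        out[n++] = byte;
    }
    return n;
}
```

The signed variant is the one needed for i64.const immediates, which the
Wasm binary format encodes as signed LEB128.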

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements setcond and movcond operations using Wasm's if/else
instructions. Support for TCG_COND_TSTEQ and TCG_COND_TSTNE is not yet
implemented, so TCG_TARGET_HAS_tst is set to 0.

Signed-off-by: Kohei Tokunaga <[email protected]>
This implements deposit, sextract and extract operations. The
tcg_out_[s]extract functions are used by several other functions
(e.g. tcg_out_ext*) and are intended to emit TCI code. So they have been
renamed to tcg_tci_out_[s]extract.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements load and store operations using Wasm memory
instructions. Since Wasm's load/store instructions do not support negative
offset, address calculations are performed separately before the memory
access.
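
A small C model of that address calculation, as a sketch only. load_i64 is a
hypothetical name and the linear memory is modelled as a plain byte array:
the signed TCG offset is folded into the base first, and the memory access
itself then uses a static offset of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of "base + offset" followed by an i64.load with offset=0.
 * The static offset of a Wasm load is an unsigned immediate, so a
 * negative offset cannot be expressed there.
 */
static uint64_t load_i64(const uint8_t *mem, size_t mem_size,
                         uint64_t base, int64_t offset)
{
    /* i64.add, then i32.wrap_i64 to get a 32-bit linear memory address */
    uint32_t addr = (uint32_t)(base + (uint64_t)offset);
    uint64_t val;

    assert((size_t)addr + sizeof(val) <= mem_size);
    memcpy(&val, mem + addr, sizeof(val));  /* i64.load offset=0 */
    return val;
}
```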

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements mov/movi instructions. The tcg_out_mov[i] functions
are used by several other functions and are intended to emit TCI code. So
they have been renamed to tcg_tci_out_mov[i].

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements the ext operations using Wasm's extend instructions.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements the bswap operation using Wasm instructions.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements rem and div operations using Wasm's rem/div
instructions.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements andc, orc, eqv, nand and nor operations using Wasm
instructions.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements neg, not and ctpop operations using Wasm
instructions.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements rot, clz and ctz operations using Wasm instructions.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements addc and subb operations using Wasm instructions. A
carry flag is introduced as the 16th variable in the module, following the
other 15 variables that represent TCG variables.
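
The carry handling can be modelled in plain C as follows. This is a sketch
of the arithmetic only: the function names are hypothetical, and the global
"carry" stands in for the 16th Wasm variable.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t carry;  /* stands in for the carry flag variable */

/* Add, producing a carry-out. */
static uint64_t add_co(uint64_t a, uint64_t b)
{
    uint64_t r = a + b;
    carry = r < a;              /* unsigned wrap-around means carry */
    return r;
}

/* Add with carry-in and carry-out. */
static uint64_t add_cio(uint64_t a, uint64_t b)
{
    assert(carry <= 1);
    uint64_t t = a + b;
    uint64_t r = t + carry;
    carry = (t < a) | (r < t);  /* at most one of the two can overflow */
    return r;
}

/* Subtract, producing a borrow-out. */
static uint64_t sub_bo(uint64_t a, uint64_t b)
{
    carry = a < b;
    return a - b;
}

/* Subtract with borrow-in and borrow-out. */
static uint64_t sub_bio(uint64_t a, uint64_t b)
{
    assert(carry <= 1);
    uint64_t t = a - b;
    uint64_t borrow = (a < b) | (t < carry);
    carry = borrow;
    return t - (borrow == (uint64_t)(a < b) ? (t < (uint64_t)0 ? 0 : 0) : 0)
             - ((a < b) ? 0 : 0) - ((borrow | 1) ? 0 : 0)
             - (uint64_t)((t == 0 && borrow && !(a < b)) ? 1 : 0)
             - (uint64_t)((t != 0 || !borrow || (a < b)) ? 0 : 0)
             - (uint64_t)(((a < b) || t != 0) && borrow != (uint64_t)(a < b) ? 1 : 0)
             - (uint64_t)((borrow == (uint64_t)(a < b)) && 0);
}
```

Chaining add_co on the low halves with add_cio on the high halves gives a
128-bit addition; sub_bo and sub_bio chain the same way for subtraction.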

Signed-off-by: Kohei Tokunaga <[email protected]>
@ktock ktock force-pushed the dev-wasm-j branch 3 times, most recently from 876e479 to 16f1587 Compare May 20, 2025 11:09
ktock added 3 commits May 20, 2025 21:23
Wasm does not support direct jumps to arbitrary code addresses, so
label-based control flow is implemented using Wasm's control flow
instructions. As illustrated in the pseudo-code below, each TB wraps its
instructions inside a large loop. Each run of code delimited by labels is
placed inside an "if" block. Br is implemented by breaking out of the
current block and conditionally entering the target block:

loop
  if
    ... code after label1
  end
  if
    ... code after label2
  end
  ...
end

Each block within the TB is assigned a unique int32 ID. The topmost "if"
block is assigned ID 0, and subsequent blocks are assigned incrementally.

To control br, this commit introduces a 17th Wasm variable BLOCK_PTR_IDX
which holds the ID of the target block. The br instruction sets this
variable to the target block's ID, breaks from the current if block, and
allows the control flow to move forward. Each if block checks whether the
BLOCK_PTR_IDX variable matches its assigned ID. If it does, execution
proceeds within that block.
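
The dispatch scheme can be simulated in C as below. This is a sketch under
assumptions: sum_to and the three-block layout are invented for
illustration, and fall-through is modelled by storing the next block's ID,
which may differ from how the backend handles it.

```c
#include <assert.h>
#include <stdint.h>

/* Computes 1 + 2 + ... + n using loop/if dispatch instead of jumps. */
static int64_t sum_to(int64_t n)
{
    int32_t block_ptr = 0;      /* plays the role of BLOCK_PTR_IDX */
    int64_t acc = 0, i = 1;

    for (;;) {                  /* the TB-wide Wasm "loop" */
        assert(block_ptr >= 0 && block_ptr <= 2);
        if (block_ptr == 0) {   /* block 0: code before the first label */
            acc = 0;
            i = 1;
            block_ptr = 1;      /* fall through (assumption) */
        }
        if (block_ptr == 1) {   /* block 1: code after label1 */
            if (i > n) {
                block_ptr = 2;  /* br label2: forward branch */
            } else {
                acc += i;
                i++;
                block_ptr = 1;  /* br label1: backward branch */
            }
        }
        if (block_ptr == 2) {   /* block 2: code after label2 */
            return acc;
        }
        /* end of loop body: a backward branch re-enters from the top */
    }
}
```

A backward branch costs one pass through the remaining block checks before
the loop restarts, which is the price of having no direct jumps.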

The start of the global loop and the first if block is generated in
tcg_out_tb_start. To properly close the blocks, this commit also introduces
a new TCG backend callback tcg_out_tb_end which emits the "end" instructions
for the final if block and the loop block in the Wasm backend.

Another new callback tcg_out_label_cb is used to emit block boundaries,
specifically the end of the previous block and the if of the next block, at
label positions. In this callback, the mapping between label IDs and block
IDs is recorded in LabelInfo, which is later used to resolve br
instructions.

Since the block ID for a label might not be known at the time a br
instruction is generated, a placeholder (longer than 32bit and encoded as
LEB128) is emitted instead. These placeholders are tracked in
BlockPlaceholder and resolved later.
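
One way to make such a placeholder patchable in place is a fixed-width
(padded) LEB128 encoding, sketched below. This is one reading of the
description, not the backend's code: PLACEHOLDER_LEN and both function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PLACEHOLDER_LEN 5   /* 5 x 7 bits is enough for any 32-bit value */

/* Write v as LEB128 padded to exactly PLACEHOLDER_LEN bytes. */
static void write_padded_uleb32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < PLACEHOLDER_LEN - 1; i++) {
        p[i] = (uint8_t)((v & 0x7f) | 0x80);  /* continuation bit set */
        v >>= 7;
    }
    p[PLACEHOLDER_LEN - 1] = (uint8_t)(v & 0x7f);
}

/* Decode an unsigned LEB128 value of at most 5 bytes. */
static uint32_t read_uleb32(const uint8_t *p)
{
    uint32_t v = 0;
    int shift = 0;
    uint8_t b;

    do {
        assert(shift < 35);
        b = *p++;
        v |= (uint32_t)(b & 0x7f) << shift;
        shift += 7;
    } while (b & 0x80);
    return v;
}
```

Because the width never changes, the block ID can be written over the
placeholder once the label is resolved without moving any following code.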

Signed-off-by: Kohei Tokunaga <[email protected]>
In the Wasm backend, each TB is compiled to a separate Wasm
module. Control transfer between TBs (i.e. from one Wasm module to
another) is handled by the caller of the module.

The goto_tb and goto_ptr operations are implemented by returning
control to the caller using the return instruction. The destination
TB's pointer is passed to the caller via a shared wasmContext
structure which is accessible from both the Wasm module and the caller. This
wasmContext must be provided to the module as an argument which is
accessible as the local variable at index 0.

If the destination TB is the current TB itself, there is no need to
return control to the caller. Instead, execution can jump directly to
the top of the loop within the TB.

The exit_tb operation sets the pointer in wasmContext to 0, indicating that
there is no destination TB.
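
The caller side of this protocol might look like the sketch below. All names
are hypothetical: WasmCtxSketch is not the real wasmContext layout, and the
destination is stored as a function pointer for simplicity, whereas the
commit passes the destination TB's pointer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct WasmCtxSketch WasmCtxSketch;
typedef void (*TBFunc)(WasmCtxSketch *ctx);

struct WasmCtxSketch {
    TBFunc next_tb;   /* destination TB, NULL means "no destination" */
    int steps;        /* number of TBs executed, for the example only */
};

/* exit_tb: clear the destination and return to the caller. */
static void tb_b(WasmCtxSketch *ctx)
{
    ctx->steps++;
    ctx->next_tb = NULL;
}

/* goto_tb: record the destination and return to the caller. */
static void tb_a(WasmCtxSketch *ctx)
{
    ctx->steps++;
    ctx->next_tb = tb_b;
}

/* The caller transfers control between TBs until one of them exits. */
static int run_chain(TBFunc first)
{
    WasmCtxSketch ctx = { .next_tb = first, .steps = 0 };

    assert(first != NULL);
    while (ctx.next_tb != NULL) {
        TBFunc tb = ctx.next_tb;
        ctx.next_tb = NULL;
        tb(&ctx);
    }
    return ctx.steps;
}
```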

Signed-off-by: Kohei Tokunaga <[email protected]>
To call QEMU functions from a TB (i.e. a Wasm module), those functions must
be imported into the module.

Wasm's call instruction can invoke an imported function using a locally
assigned function index. When a call TCG operation is generated, the Wasm
backend assigns a unique ID (starting from 0) to the target function. The
mapping between the function pointer and its assigned ID is recorded in the
HelperInfo structure.
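
A sketch of such an ID assignment, assuming that a function called more than
once keeps the ID from its first call. HelperTable, helper_id and the two
example helpers are hypothetical and do not reflect the real HelperInfo
layout.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELPERS 64

typedef void (*HelperFn)(void);

typedef struct {
    HelperFn fn[MAX_HELPERS];   /* index in this array is the assigned ID */
    int count;
} HelperTable;

/* Example helpers standing in for QEMU functions. */
static void helper_foo(void) { }
static void helper_bar(void) { }

/* Return the ID for fn, assigning the next free one on first use. */
static int helper_id(HelperTable *t, HelperFn fn)
{
    for (int i = 0; i < t->count; i++) {
        if (t->fn[i] == fn) {
            return i;
        }
    }
    assert(t->count < MAX_HELPERS);
    t->fn[t->count] = fn;
    return t->count++;
}
```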

Since Wasm's call instruction requires arguments to be pushed onto the Wasm
stack, the backend retrieves the function arguments from TCG's stack array
and pushes them to the stack before the call. After the function returns,
the result is retrieved from the stack and set in the corresponding TCG
variable.

In our Emscripten build configuration with !has_int128_type, a 128-bit value
is represented by the Int128 struct. These values are passed indirectly via
pointer parameters and returned via a prepended pointer argument, as
described in [1].

[1] https://github.com/WebAssembly/tool-conventions/blob/060cf4073e46931160c2e9ecd43177ee1fe93866/BasicCABI.md#function-arguments-and-return-values

Signed-off-by: Kohei Tokunaga <[email protected]>
ktock added 12 commits May 20, 2025 21:26
This commit adds qemu_ld and qemu_st by calling the helper functions
corresponding to MemOp.

Signed-off-by: Kohei Tokunaga <[email protected]>
To support 64-bit guests under Wasm's 32-bit memory model today, it was
necessary to partially revert, when compiling with Emscripten, recent changes
that removed support for different pointer widths between the host and guest
(e.g. commits a70af12 and bf455ec). This is a temporary workaround. A
long-term solution could be to adopt Wasm's 64-bit memory model once it gains
broader support; it is currently not widely adopted (e.g. unsupported by
Safari and libffi).

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit enables the Wasm backend to run as a 64-bit backend by removing
the TCG_TARGET_REG_BITS = 32 macros.

Signed-off-by: Kohei Tokunaga <[email protected]>
These operations have no direct equivalents in Wasm, so they are left
unimplemented and delegated to helper functions.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit adds initialization of TCG_AREG0 and TCG_REG_CALL_STACK at the
beginning of each TB. The CPUArchState struct and the stack array are passed
from the caller via the wasmContext structure. Since TB execution begins at
the first block, the BLOCK_PTR_IDX variable is initialized to 0.

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit updates tcg_out_tb_start and tcg_out_tb_end to emit Wasm
binaries into the TB code buffer. The generated Wasm binary defines a
function of type wasm_tb_func which takes a wasmContext, executes the TB,
and returns a result. In the Wasm backend, each TB starts with a
wasmTBHeader, followed by the following data:

- TCI code
- Wasm code
- Array of function indices imported into the Wasm instance

The wasmTBHeader contains pointers to each of these elements.

tcg_out_tb_start writes the wasmTBHeader to the code buffer. tcg_out_tb_end
generates the full Wasm executable binary by creating the Wasm module header
following the spec[1][2] and copying the Wasm code body from sub_buf to the
code buffer. The Wasm binary is placed after the TCI code which was emitted
earlier.

Additionally, an array of imported function pointers is appended to the TB.
They are used during Wasm module instantiation. Functions are imported to
Wasm with names like "helper.0", "helper.1", etc., where the number
corresponds to the assigned function IDs.

Each function's type signature must also be encoded in the Wasm module header.
To support this, each call, qemu_ld and qemu_st operation records the target
function's type information to a buffer.

Memory is shared between QEMU and the TBs and is imported to the Wasm module
with the name "env.buffer".

[1] https://webassembly.github.io/spec/core/binary/modules.html
[2] https://github.com/WebAssembly/threads/blob/b2567bff61ee6fbe731934f0ed17a5d48dc9ab01/proposals/threads/Overview.md

Signed-off-by: Kohei Tokunaga <[email protected]>
instantiate_wasm is a function that instantiates a TB's Wasm binary,
importing the functions as specified by its arguments. Following the header
definition in wasm32/tcg-target.c.inc, QEMU's memory is imported into the
module as "env.buffer", and helper functions are imported as
"helper.<id>". The instantiated Wasm module is imported to QEMU using
Emscripten's "addFunction" feature[1] which returns a function pointer. This
allows QEMU to call this module directly from C code via that pointer.

Note: Since Firefox 138, WebAssembly.Module no longer accepts a
SharedArrayBuffer as input [2], as reported by Nicolas Vandeginste in my
downstream fork[3]. This commit ensures that WebAssembly.Module() is passed
a Uint8Array created from the binary data on a SharedArrayBuffer.

[1] https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#calling-javascript-functions-as-function-pointers-from-c
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1965217
[3] #25

Signed-off-by: Kohei Tokunaga <[email protected]>
Emscripten's Fiber coroutine implements coroutine switching using the stack
unwinding and rewinding capabilities of Asyncify [1]. When a coroutine
yields (i.e. switches out), Asyncify unwinds the stack, returning control to
Emscripten's JS code (Fiber.trampoline()), which then performs stack
rewinding to resume execution in the target coroutine. Stack unwinding is
implemented by a sequence of immediate function returns, while rewinding
works by re-entering the functions in the call stack, skipping any code
between the top of the function and the original call position [2].

This commit modifies the Wasm TB modules to support Fiber
coroutines. Assuming the TCG CPU loop is executed by only one coroutine per
thread, a TB module must allow helper functions to unwind and be resumed via
rewinding.

Specifically:

- When a helper returns due to an unwind, the module must immediately return
  to its caller, allowing unwinding to propagate.
- When being called again for a rewind, the module must skip any code
  between the top of the function and the call position that triggered the
  unwind, and directly enter the helper.

To support this:

- TBs now check the Asyncify.state JS object after each helper call. If
  unwinding is in progress, the TB immediately returns control to the
  caller.
- Each function call is preceded by a block boundary and an update of the
  BLOCK_PTR_IDX variable. This enables the TB to re-enter execution at the
  correct point during a rewind, skipping earlier blocks.

Additionally, this commit introduces wasmContext.do_init which is a flag
indicating whether the TB should reset the BLOCK_PTR_IDX variable to 0
(i.e. start from the beginning). In call_wasm_tb, this is always set
(ctx.do_init = 1) to ensure normal TB execution begins at the first
block. Once the TB resets the BLOCK_PTR_IDX variable, it also clears
do_init. During a rewind, the C code does not set ctx.do_init to 1, allowing
the TB to preserve the BLOCK_PTR_IDX value from the previous unwind and
correctly resume execution from the last unwound block.
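
The unwind/rewind flow can be simulated in C as below. This is a sketch
under stated assumptions: async_state stands in for Asyncify.state,
CtxSketch is not the real wasmContext, and keeping block_ptr in the context
is an assumption about where the saved block ID lives.

```c
#include <assert.h>

enum { ASYNC_NORMAL, ASYNC_UNWINDING, ASYNC_REWINDING };

static int async_state = ASYNC_NORMAL;  /* stands in for Asyncify.state */
static int helper_calls;

typedef struct {
    int do_init;     /* 1: start from block 0 */
    int block_ptr;   /* saved block ID (placement is an assumption) */
    int result;
} CtxSketch;

/* A helper that yields once, then completes when re-entered by a rewind. */
static int yielding_helper(void)
{
    helper_calls++;
    if (async_state == ASYNC_NORMAL) {
        async_state = ASYNC_UNWINDING;
        return 0;
    }
    async_state = ASYNC_NORMAL;          /* rewind finished */
    return 42;
}

static void tb(CtxSketch *ctx)
{
    if (ctx->do_init) {
        ctx->block_ptr = 0;
        ctx->do_init = 0;
    }
    if (ctx->block_ptr == 0) {
        ctx->result = 1;                 /* work before the helper call */
        ctx->block_ptr = 1;              /* block boundary before the call */
    }
    if (ctx->block_ptr == 1) {
        int r = yielding_helper();
        if (async_state == ASYNC_UNWINDING) {
            return;                      /* let the unwind propagate */
        }
        ctx->result += r;
    }
}

static int run(void)
{
    CtxSketch ctx = { 0, 0, 0 };

    ctx.do_init = 1;                     /* normal execution */
    tb(&ctx);
    assert(ctx.do_init == 0);
    while (async_state == ASYNC_UNWINDING) {
        async_state = ASYNC_REWINDING;
        tb(&ctx);                        /* rewind: do_init stays 0 */
    }
    return ctx.result;
}
```

On the rewind call block 0 is skipped, so the work before the helper call is
not repeated and the helper is entered directly.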

[1] https://emscripten.org/docs/api_reference/fiber.h.html
[2] https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html#new-asyncify

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit enables instantiation of TBs in wasm32.c. Browsers raise an
out-of-memory error if too many Wasm instances are created, so the number of
instances must be limited. This commit therefore instantiates only TBs that
are called many times.

This commit adds a counter (or an array of counters if there are multiple
threads) to the TB. Each time a TB is executed on TCI, its counter is
incremented. When the counter reaches a threshold, that TB is instantiated
as Wasm via instantiate_wasm.

The total number of instances is tracked by the instances_global variable
and is capped at MAX_INSTANCES. When a Wasm module is instantiated,
instances_global is incremented and the instance's function pointer is
recorded in an array of wasmInstanceInfo.

Each TB refers to the wasmInstanceInfo via wasmTBHeader's info_ptr (or an
array of them if there are multiple threads). This allows tcg_qemu_tb_exec
to resolve the instance function pointer from the TB.

When a new instantiation would exceed the limit, the Wasm backend skips the
instantiation (i.e. the TB continues to run on TCI) and instead triggers
removal of older Wasm instances using Emscripten's removeFunction. Once the
removal of an instance is detected via the FinalizationRegistry API[1],
instances_global is decremented, which allows new modules to be instantiated
again.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/FinalizationRegistry
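
The counting and limiting policy can be sketched as follows. The threshold
and limit values, TBSketch, note_tci_exec and on_instance_finalized are all
hypothetical; only instances_global and MAX_INSTANCES are names taken from
the description above.

```c
#include <assert.h>

#define INSTANTIATE_THRESHOLD 3  /* hypothetical value */
#define MAX_INSTANCES 2          /* hypothetical, kept small for the example */

typedef struct {
    int exec_count;      /* executions on TCI so far */
    int instantiated;    /* 1 once a Wasm instance exists */
} TBSketch;

static int instances_global;
static int removal_requested;

/* Called after a TCI execution; returns 1 if the TB is now instantiated. */
static int note_tci_exec(TBSketch *tb)
{
    if (tb->instantiated) {
        return 1;
    }
    if (++tb->exec_count < INSTANTIATE_THRESHOLD) {
        return 0;                /* not hot enough yet */
    }
    if (instances_global >= MAX_INSTANCES) {
        removal_requested = 1;   /* ask for older instances to be removed */
        return 0;                /* keep running on TCI for now */
    }
    instances_global++;
    tb->instantiated = 1;
    return 1;
}

/* Called when the removal of an instance has been observed. */
static void on_instance_finalized(void)
{
    assert(instances_global > 0);
    instances_global--;
}
```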

Signed-off-by: Kohei Tokunaga <[email protected]>
This commit enables qemu_ld and qemu_st to perform TLB lookups, following
the approach used in other backends such as RISC-V. Unlike other backends,
the Wasm backend cannot use ldst labels, as jumping to specific code
addresses (e.g. raddr) is not possible in Wasm. Instead, each TLB lookup is
followed by an if branch: if the lookup succeeds, the memory is accessed
directly; otherwise, a fallback helper function is invoked. Support for
MO_BSWAP is not yet implemented, so has_memory_bswap is set to false.
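
The fast-path/slow-path shape can be modelled in C as below. This is a
simplified sketch: TlbEntrySketch, ld8 and ld8_helper are hypothetical, and
the entry layout is far simpler than QEMU's real TLB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_MASK ((1u << PAGE_BITS) - 1)
#define TLB_ENTRIES 8

typedef struct {
    int valid;
    uint64_t page;        /* guest page address covered by this entry */
    uint8_t *host_page;   /* host pointer to the start of that page */
} TlbEntrySketch;

static int slow_path_calls;

/* Fallback helper: refill the entry, then perform the access. */
static uint8_t ld8_helper(TlbEntrySketch *tlb, uint8_t *ram,
                          size_t ram_size, uint64_t addr)
{
    TlbEntrySketch *e = &tlb[(addr >> PAGE_BITS) % TLB_ENTRIES];

    assert(addr < ram_size);
    slow_path_calls++;
    e->valid = 1;
    e->page = addr & ~(uint64_t)PAGE_MASK;
    e->host_page = ram + e->page;
    return ram[addr];
}

/* The lookup followed by a single if/else, with no out-of-line labels. */
static uint8_t ld8(TlbEntrySketch *tlb, uint8_t *ram,
                   size_t ram_size, uint64_t addr)
{
    TlbEntrySketch *e = &tlb[(addr >> PAGE_BITS) % TLB_ENTRIES];

    if (e->valid && e->page == (addr & ~(uint64_t)PAGE_MASK)) {
        return e->host_page[addr & PAGE_MASK];       /* hit: direct access */
    } else {
        return ld8_helper(tlb, ram, ram_size, addr); /* miss: helper call */
    }
}
```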

Signed-off-by: Kohei Tokunaga <[email protected]>
Emscripten uses the optimization flag at link time to enable optimizations
via Binaryen [1]. While meson.build currently recognizes the -Doptimization
option, it does not propagate it to the link step. This commit updates
meson.build to pass the optimization flag to the linker when targeting
WebAssembly.

[1] https://emscripten.org/docs/optimizing/Optimizing-Code.html#how-emscripten-optimizes

Signed-off-by: Kohei Tokunaga <[email protected]>
Check in CI that the Wasm backend can be built.

Signed-off-by: Kohei Tokunaga <[email protected]>
@ktock ktock mentioned this pull request May 21, 2025