Rebase on the latest #26
Draft
ktock wants to merge 33 commits into sync-upstream-wasmdev from dev-wasm-j
49aa5b1 to 35462d6
Wasm backend is implemented based on the TCI backend and utilizes a forked TCI to execute TBs. Signed-off-by: Kohei Tokunaga <[email protected]>
Wasm backend should implement its own disassembler for Wasm instructions. Signed-off-by: Kohei Tokunaga <[email protected]>
Now that there is a backend for WebAssembly build (/tcg/wasm32/), the requirement of --enable-tcg-interpreter in meson.build can be removed. Signed-off-by: Kohei Tokunaga <[email protected]>
WebAssembly instructions vary in size, including single-byte instructions. This commit sets TCG_TARGET_INSN_UNIT_SIZE to 1 and updates the TCI fork to use "tcg_insn_unit_tci" (a uint32_t) for 4-byte operations. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements and, or and xor operations using Wasm instructions. Each TCG variable is mapped to a 64-bit Wasm variable. In Wasm, and/or/xor instructions operate on values that are first pushed onto the Wasm stack using get instructions. The result is left on the stack and can be assigned to a variable by popping it with a set instruction. The Wasm binary format is documented at [1].

In this backend, TCI instructions are emitted to s->code_ptr, while the corresponding Wasm instructions are generated into a separate buffer allocated via tcg_malloc(). These two code buffers must be merged into the final code buffer before tcg_gen_code returns.

[1] https://webassembly.github.io/spec/core/binary/index.html

Signed-off-by: Kohei Tokunaga <[email protected]>
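To make the get/op/set pattern concrete, here is a minimal sketch of how "dst = lhs op rhs" could be encoded as Wasm bytes. The opcode values come from the Wasm binary format; the WasmBuf type and the wasm_out8 and wasm_out_binop names are hypothetical and are not taken from the actual backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcodes as defined by the Wasm binary format. */
enum {
    WASM_LOCAL_GET = 0x20,
    WASM_LOCAL_SET = 0x21,
    WASM_I64_AND   = 0x83,
    WASM_I64_OR    = 0x84,
    WASM_I64_XOR   = 0x85,
};

/* Hypothetical fixed-size buffer; the real backend uses tcg_malloc(). */
typedef struct {
    uint8_t data[64];
    size_t len;
} WasmBuf;

static void wasm_out8(WasmBuf *buf, uint8_t v)
{
    assert(buf->len < sizeof(buf->data));
    buf->data[buf->len++] = v;
}

/*
 * Emit "dst = lhs <opc> rhs": push both operands, apply the operator,
 * then pop the result into dst. Indices below 128 fit in a single
 * LEB128 byte, which is all this sketch supports.
 */
static void wasm_out_binop(WasmBuf *buf, uint8_t opc,
                           uint8_t dst, uint8_t lhs, uint8_t rhs)
{
    assert(dst < 128 && lhs < 128 && rhs < 128);
    wasm_out8(buf, WASM_LOCAL_GET);
    wasm_out8(buf, lhs);
    wasm_out8(buf, WASM_LOCAL_GET);
    wasm_out8(buf, rhs);
    wasm_out8(buf, opc);
    wasm_out8(buf, WASM_LOCAL_SET);
    wasm_out8(buf, dst);
}
```

Calling wasm_out_binop(&buf, WASM_I64_AND, 2, 0, 1) on an empty buffer produces the seven bytes 20 00 20 01 83 21 02.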
Add, sub and mul operations are implemented using the corresponding instructions in Wasm. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements shl, shr and sar operations using Wasm instructions. The Wasm backend uses 64-bit variables, so the right shift operation for 32-bit values needs to extract the lower 32 bits of the operand before shifting.

Additionally, since constant values must be encoded in LEB128 format, this commit introduces an encoder function implemented following [1].

[1] https://en.wikipedia.org/wiki/LEB128

Signed-off-by: Kohei Tokunaga <[email protected]>
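For reference, a minimal LEB128 encoder can be sketched as follows. This follows the standard algorithm rather than the backend's own encoder, and the function names are hypothetical. The signed variant assumes that right-shifting a negative value is an arithmetic shift, which holds on common compilers but is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode an unsigned value; returns the number of bytes written (<= 10). */
static size_t leb128_encode_u64(uint64_t v, uint8_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    do {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        if (v != 0) {
            byte |= 0x80;          /* more bytes follow */
        }
        out[n++] = byte;
    } while (v != 0);
    return n;
}

/* Encode a signed value; returns the number of bytes written (<= 10). */
static size_t leb128_encode_s64(int64_t v, uint8_t *out)
{
    size_t n = 0;
    int more = 1;

    assert(out != NULL);
    while (more) {
        uint8_t byte = v & 0x7f;
        v >>= 7;                   /* assumed to be an arithmetic shift */
        /* Stop once the remaining bits are pure sign extension. */
        if ((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40))) {
            more = 0;
        } else {
            byte |= 0x80;
        }
        out[n++] = byte;
    }
    return n;
}
```

With these definitions, the unsigned value 624485 encodes to E5 8E 26 and the signed value -123456 encodes to C0 BB 78, matching the examples in [1].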
This commit implements setcond and movcond operations using Wasm's if/else instructions. Support for TCG_COND_TSTEQ and TCG_COND_TSTNE is not yet implemented, so TCG_TARGET_HAS_tst is set to 0. Signed-off-by: Kohei Tokunaga <[email protected]>
This implements deposit, sextract and extract operations. The tcg_out_[s]extract functions are used by several other functions (e.g. tcg_out_ext*) and are intended to emit TCI code. So they have been renamed to tcg_tci_out_[s]extract. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements load and store operations using Wasm memory instructions. Since Wasm's load/store instructions don't support negative offset, address calculations are performed separately before the memory access. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements mov/movi instructions. The tcg_out_mov[i] functions are used by several other functions and are intended to emit TCI code. So they have been renamed to tcg_tci_out_mov[i]. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements the ext operations using Wasm's extend instructions. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements the bswap operation using Wasm instructions. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements rem and div operations using Wasm's rem/div instructions. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements andc, orc, eqv, nand and nor operations using Wasm instructions. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements neg, not and ctpop operations using Wasm instructions. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements rot, clz and ctz operations using Wasm instructions. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit implements addc and subb operations using Wasm instructions. A carry flag is introduced as the 16th variable in the module following other 15 variables that represent TCG variables. Signed-off-by: Kohei Tokunaga <[email protected]>
876e479 to 16f1587
Wasm does not support direct jumps to arbitrary code addresses, so label-based control flow is implemented using Wasm's control flow instructions. As illustrated in the pseudo-code below, each TB wraps its instructions inside a large loop. Each set of codes separated by labels is placed inside an "if" block. Br is implemented by breaking out of the current block and conditionally entering the target block:

    loop
      if
        ... code after label1
      end
      if
        ... code after label2
      end
      ...
    end

Each block within the TB is assigned a unique int32 ID. The topmost "if" block is assigned ID 0, and subsequent blocks are assigned incrementally. To control br, this commit introduces a 17th Wasm variable BLOCK_PTR_IDX which holds the ID of the target block. The br instruction sets this variable to the target block's ID, breaks from the current if block, and allows the control flow to move forward. Each if block checks whether the BLOCK_PTR_IDX variable matches its assigned ID. If it does, execution proceeds within that block.

The start of the global loop and the first if block is generated in tcg_out_tb_start. To properly close the blocks, this commit also introduces a new TCG backend callback tcg_out_tb_end which emits the "end" instructions for the final if block and the loop block in the Wasm backend.

Another new callback tcg_out_label_cb is used to emit block boundaries, specifically the end of the previous block and the if of the next block, at label positions. In this callback, the mapping between label IDs and block IDs is recorded in LabelInfo, which is later used to resolve br instructions.

Since the block ID for a label might not be known at the time a br instruction is generated, a placeholder (longer than 32bit and encoded as LEB128) is emitted instead. These placeholders are tracked in BlockPlaceholder and resolved later.

Signed-off-by: Kohei Tokunaga <[email protected]>
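The loop-and-if dispatch described above can be modelled in plain C. The sketch below is only an illustration of the control flow, not generated code: run_toy_tb is a hypothetical TB that sums 1..n using a backward branch, and block_ptr plays the role of BLOCK_PTR_IDX. How the real backend advances the block ID on fall-through is not stated in the commit message, so the explicit assignments used for fall-through here are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy TB with two labels:
 *   block 0: initialisation, falls through to label1
 *   block 1: code after label1, "br label1" while i <= n
 *   block 2: code after label2, leaves the TB
 */
static uint64_t run_toy_tb(uint64_t n)
{
    uint32_t block_ptr = 0;   /* stand-in for BLOCK_PTR_IDX */
    uint64_t i = 0, sum = 0;

    for (;;) {                        /* the TB-wide "loop" */
        if (block_ptr == 0) {         /* topmost if block, ID 0 */
            i = 1;
            sum = 0;
            block_ptr = 1;            /* assumed fall-through handling */
        }
        if (block_ptr == 1) {         /* code after label1 */
            if (i <= n) {
                sum += i;
                i++;
                block_ptr = 1;        /* br label1: taken on the next pass */
            } else {
                block_ptr = 2;        /* fall through to label2 */
            }
        }
        if (block_ptr == 2) {         /* code after label2 */
            assert(i == n + 1);
            return sum;
        }
        /* Reaching here restarts the loop; non-matching blocks are skipped. */
    }
}
```

A backward branch costs one extra pass over the loop: the blocks after the current one fail their ID check, the loop restarts, and the blocks before the target are skipped the same way.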
In the Wasm backend, each TB is compiled to a separate Wasm module. Control transfer between TBs (i.e. from one Wasm module to another) is handled by the caller of the module.

The goto_tb and goto_ptr operations are implemented by returning control to the caller using the return instruction. The destination TB's pointer is passed to the caller via a shared wasmContext structure which is accessible from both the Wasm module and the caller. This wasmContext must be provided to the module as an argument, which is accessible as the local variable at index 0.

If the destination TB is the current TB itself, there is no need to return control to the caller. Instead, execution can jump directly to the top of the loop within the TB.

The exit_tb operation sets the pointer in wasmContext to 0, indicating that there is no destination TB.

Signed-off-by: Kohei Tokunaga <[email protected]>
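The caller-side handling of goto_tb and exit_tb can be illustrated with a small trampoline. Everything in this sketch is hypothetical (ToyContext, toy_tb_func, and the two toy TBs); the fields of the real wasmContext and the shape of the real caller are not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyContext ToyContext;
typedef void (*toy_tb_func)(ToyContext *ctx);

/* Hypothetical stand-in for wasmContext. */
struct ToyContext {
    toy_tb_func next_tb;   /* destination TB, or NULL after exit_tb */
    uint64_t acc;          /* some guest state touched by the TBs */
};

/* Second TB: does its work, then "exit_tb" clears the destination. */
static void toy_tb_b(ToyContext *ctx)
{
    ctx->acc += 10;
    ctx->next_tb = NULL;
}

/* First TB: does its work, then "goto_tb" names the next TB and returns. */
static void toy_tb_a(ToyContext *ctx)
{
    ctx->acc += 1;
    ctx->next_tb = toy_tb_b;
}

/* The caller follows the chain until a TB reports no destination. */
static uint64_t toy_run_from(toy_tb_func first)
{
    ToyContext ctx = { .next_tb = first, .acc = 0 };

    assert(first != NULL);
    while (ctx.next_tb != NULL) {
        toy_tb_func tb = ctx.next_tb;
        tb(&ctx);
    }
    return ctx.acc;
}
```

Here toy_run_from(toy_tb_a) runs both TBs in sequence and returns 11. A TB that targets itself would not go through this loop at all, since it can branch back to the top of its own loop.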
To call QEMU functions from a TB (i.e. a Wasm module), those functions must be imported into the module. Wasm's call instruction can invoke an imported function using a locally assigned function index. When a call TCG operation is generated, the Wasm backend assigns a unique ID (starting from 0) to the target function. The mapping between the function pointer and its assigned ID is recorded in the HelperInfo structure.

Since Wasm's call instruction requires arguments to be pushed onto the Wasm stack, the backend retrieves the function arguments from TCG's stack array and pushes them to the stack before the call. After the function returns, the result is retrieved from the stack and set in the corresponding TCG variable.

In our Emscripten build configuration with !has_int128_type, a 128-bit value is represented by the Int128 struct. These values are passed indirectly via pointer parameters and returned via a prepended pointer argument, as described in [1].

[1] https://github.com/WebAssembly/tool-conventions/blob/060cf4073e46931160c2e9ecd43177ee1fe93866/BasicCABI.md#function-arguments-and-return-values

Signed-off-by: Kohei Tokunaga <[email protected]>
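The ID assignment can be sketched as a small lookup table. The names below (ToyHelperTable, toy_helper_id, the two dummy helpers) are hypothetical and do not reflect the layout of the real HelperInfo structure. Reusing an existing ID when the same function is called again is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HELPERS 16

typedef void (*toy_helper_fn)(void);

/* Hypothetical per-TB table mapping function pointers to import IDs. */
typedef struct {
    toy_helper_fn funcs[TOY_MAX_HELPERS];
    int count;
} ToyHelperTable;

/* Two dummy helpers used only to have distinct function pointers. */
static void toy_helper_x(void) {}
static void toy_helper_y(void) {}

/*
 * Return the ID for fn, assigning the next free ID (starting from 0)
 * on first use. Returns -1 when the table is full.
 */
static int toy_helper_id(ToyHelperTable *t, toy_helper_fn fn)
{
    assert(t != NULL && fn != NULL);
    for (int i = 0; i < t->count; i++) {
        if (t->funcs[i] == fn) {
            return i;
        }
    }
    if (t->count == TOY_MAX_HELPERS) {
        return -1;
    }
    t->funcs[t->count] = fn;
    return t->count++;
}
```

The returned ID is what a call instruction in the module would use as its function index, and the table order is what the instantiation step would need in order to bind each import.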
This commit adds qemu_ld and qemu_st by calling the helper functions corresponding to MemOp. Signed-off-by: Kohei Tokunaga <[email protected]>
To enable 64-bit guest support in Wasm's 32-bit memory model today, it was necessary to partially revert recent changes that removed support for different pointer widths between the host and guest (e.g. commits a70af12 and bf455ec) when compiling with Emscripten. While this serves as a temporary workaround, a long-term solution could involve adopting Wasm's 64-bit memory model once it gains broader support, as it is currently not widely adopted (e.g. unsupported by Safari and libffi). Signed-off-by: Kohei Tokunaga <[email protected]>
This commit enables the Wasm backend to run as a 64-bit backend by removing the TCG_TARGET_REG_BITS = 32 macros. Signed-off-by: Kohei Tokunaga <[email protected]>
These operations have no direct equivalents in Wasm, so they are left unimplemented and delegated to helper functions. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit adds initialization of TCG_AREG0 and TCG_REG_CALL_STACK at the beginning of each TB. The CPUArchState struct and the stack array are passed from the caller via the wasmContext structure. Since TB execution begins at the first block, the BLOCK_PTR_IDX variable is initialized to 0. Signed-off-by: Kohei Tokunaga <[email protected]>
This commit updates tcg_out_tb_start and tcg_out_tb_end to emit Wasm binaries into the TB code buffer. The generated Wasm binary defines a function of type wasm_tb_func which takes a wasmContext, executes the TB, and returns a result.

In the Wasm backend, each TB starts with a wasmTBHeader, followed by this data:

- TCI code
- Wasm code
- Array of function indices imported into the Wasm instance

The wasmTBHeader contains pointers to each of these elements. tcg_out_tb_start writes the wasmTBHeader to the code buffer. tcg_out_tb_end generates the full Wasm executable binary by creating the Wasm module header following the spec [1][2] and copying the Wasm code body from sub_buf to the code buffer. The Wasm binary is placed after the TCI code which was emitted earlier.

Additionally, an array of imported function pointers is appended to the TB. They are used during Wasm module instantiation. Functions are imported to Wasm with names like "helper.0", "helper.1", etc., where the number corresponds to the assigned function ID.

Each function's type signature must also be encoded in the Wasm module header. To support this, each call, qemu_ld and qemu_st operation records the target function's type information to a buffer.

Memory is shared between QEMU and the TBs and is imported to the Wasm module with the name "env.buffer".

[1] https://webassembly.github.io/spec/core/binary/modules.html
[2] https://github.com/WebAssembly/threads/blob/b2567bff61ee6fbe731934f0ed17a5d48dc9ab01/proposals/threads/Overview.md

Signed-off-by: Kohei Tokunaga <[email protected]>
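The TB layout described above (header, TCI code, Wasm code, import array) can be sketched as an offset calculation. ToyTBHeader and toy_tb_layout are hypothetical: the real wasmTBHeader stores pointers and its fields are not listed in the commit message, so this sketch uses byte offsets from the start of the TB instead. Rounding the import array up to a 4-byte boundary is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header; offsets are relative to the start of the TB. */
typedef struct {
    uint32_t tci_off;       /* start of the TCI code */
    uint32_t wasm_off;      /* start of the Wasm binary */
    uint32_t wasm_size;     /* size of the Wasm binary in bytes */
    uint32_t import_off;    /* start of the import array */
    uint32_t import_count;  /* number of imported functions */
} ToyTBHeader;

/* Compute where each part lands when they are laid out back to back. */
static ToyTBHeader toy_tb_layout(uint32_t tci_size, uint32_t wasm_size,
                                 uint32_t import_count)
{
    ToyTBHeader h;

    h.tci_off = (uint32_t)sizeof(ToyTBHeader);
    h.wasm_off = h.tci_off + tci_size;       /* Wasm follows the TCI code */
    h.wasm_size = wasm_size;
    /* Assumed: align the import array to 4 bytes. */
    h.import_off = (h.wasm_off + wasm_size + 3u) & ~3u;
    h.import_count = import_count;
    assert(h.import_off >= h.wasm_off + wasm_size);
    return h;
}
```

Keeping the header at the front means a consumer only needs the TB's start address to find the TCI code for interpretation and the Wasm binary plus import array for instantiation.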
instantiate_wasm is a function that instantiates a TB's Wasm binary, importing the functions as specified by its arguments. Following the header definition in wasm32/tcg-target.c.inc, QEMU's memory is imported into the module as "env.buffer", and helper functions are imported as "helper.<id>".

The instantiated Wasm module is imported to QEMU using Emscripten's "addFunction" feature [1], which returns a function pointer. This allows QEMU to call this module directly from C code via that pointer.

Note: Since Firefox 138, WebAssembly.Module no longer accepts a SharedArrayBuffer as input [2], as reported by Nicolas Vandeginste in my downstream fork [3]. This commit ensures that WebAssembly.Module() is passed a Uint8Array created from the binary data on a SharedArrayBuffer.

[1] https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#calling-javascript-functions-as-function-pointers-from-c
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1965217
[3] #25

Signed-off-by: Kohei Tokunaga <[email protected]>
Emscripten's Fiber coroutine implements coroutine switching using the stack unwinding and rewinding capabilities of Asyncify [1]. When a coroutine yields (i.e. switches out), Asyncify unwinds the stack, returning control to Emscripten's JS code (Fiber.trampoline()), which then performs stack rewinding to resume execution in the target coroutine. Stack unwinding is implemented by a sequence of immediate function returns, while rewinding works by re-entering the functions in the call stack, skipping any code between the top of the function and the original call position [2].

This commit modifies the Wasm TB modules to support Fiber coroutines. Assuming the TCG CPU loop is executed by only one coroutine per thread, a TB module must allow helper functions to unwind and be resumed via rewinding.

Specifically:

- When a helper returns due to an unwind, the module must immediately return to its caller, allowing unwinding to propagate.
- When being called again for a rewind, the module must skip any code between the top of the function and the call position that triggered the unwind, and directly enter the helper.

To support this:

- TBs now check the Asyncify.state JS object after each helper call. If unwinding is in progress, the TB immediately returns control to the caller.
- Each function call is preceded by a block boundary and an update of the BLOCK_PTR_IDX variable. This enables the TB to re-enter execution at the correct point during a rewind, skipping earlier blocks.

Additionally, this commit introduces wasmContext.do_init, which is a flag indicating whether the TB should reset the BLOCK_PTR_IDX variable to 0 (i.e. start from the beginning). In call_wasm_tb, this is always set (ctx.do_init = 1) to ensure normal TB execution begins at the first block. Once the TB resets the BLOCK_PTR_IDX variable, it also clears do_init. During a rewind, the C code does not set ctx.do_init to 1, allowing the TB to preserve the BLOCK_PTR_IDX value from the previous unwind and correctly resume execution from the last unwound block.

[1] https://emscripten.org/docs/api_reference/fiber.h.html
[2] https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html#new-asyncify

Signed-off-by: Kohei Tokunaga <[email protected]>
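The interaction between do_init, the block ID and an unwinding helper can be modelled in C. This is a simulation under stated assumptions, not the backend's code: ToyCtx, toy_helper and toy_tb are hypothetical, the unwinding flag stands in for the Asyncify.state check, and the block ID is kept in the context so that it survives the early return (the commit message does not say where the real value is preserved).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for wasmContext plus Asyncify state. */
typedef struct {
    bool do_init;        /* set by the caller for a fresh run only */
    bool unwinding;      /* stand-in for "Asyncify is unwinding" */
    uint32_t block_ptr;  /* stand-in for BLOCK_PTR_IDX */
    int pre_work_runs;   /* how often the code before the call ran */
    int helper_calls;
    bool done;
} ToyCtx;

/* The first call pretends to yield (unwind); the second one completes. */
static void toy_helper(ToyCtx *ctx)
{
    ctx->helper_calls++;
    ctx->unwinding = (ctx->helper_calls == 1);
}

static void toy_tb(ToyCtx *ctx)
{
    if (ctx->do_init) {            /* fresh run: start at the first block */
        ctx->block_ptr = 0;
        ctx->do_init = false;
    }
    for (;;) {
        if (ctx->block_ptr == 0) {
            ctx->pre_work_runs++;
            ctx->block_ptr = 1;    /* block boundary before the call */
        }
        if (ctx->block_ptr == 1) {
            toy_helper(ctx);
            if (ctx->unwinding) {
                return;            /* let the unwind propagate */
            }
            ctx->block_ptr = 2;
        }
        if (ctx->block_ptr == 2) {
            ctx->done = true;
            return;
        }
    }
}
```

A fresh run sets do_init before calling toy_tb and stops at the helper with block_ptr left at 1. Calling toy_tb again without touching do_init models the rewind: block 0 is skipped, the helper is re-entered directly, and the code before the call runs only once overall.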
This commit enables instantiation of TBs in wasm32.c. Browsers raise an out-of-memory error if too many Wasm instances are created, so the number of instances needs to be limited. This commit therefore restricts instantiation to TBs that are called many times.

This commit adds a counter (or an array of counters if there are multiple threads) to the TB. Each time a TB is executed on TCI, its counter is incremented. When the counter reaches a threshold, that TB is instantiated as Wasm via instantiate_wasm.

The total number of instances is tracked by the instances_global variable and is capped at MAX_INSTANCES. When a Wasm module is instantiated, instances_global is incremented and the instance's function pointer is recorded in an array of wasmInstanceInfo. Each TB refers to its wasmInstanceInfo via wasmTBHeader's info_ptr (or an array of them if there are multiple threads). This allows tcg_qemu_tb_exec to resolve the instance function pointer from the TB.

When a new instantiation risks exceeding the limit, the Wasm backend does not perform the instantiation (i.e. the TB continues to be executed on TCI). Instead, it triggers removal of older Wasm instances using Emscripten's removeFunction function. Once the removal of an instance is detected via the FinalizationRegistry API [1], instances_global is decremented, which allows new modules to be instantiated again.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/FinalizationRegistry

Signed-off-by: Kohei Tokunaga <[email protected]>
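The counting and capping policy can be sketched as follows. The threshold value, the limit of one instance and all toy_ names are hypothetical choices made to keep the example small; the real threshold and MAX_INSTANCES values are not given in the commit message, and the eviction itself (removeFunction plus FinalizationRegistry) is reduced here to a flag and a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_THRESHOLD     3   /* hypothetical hotness threshold */
#define TOY_MAX_INSTANCES 1   /* stands in for MAX_INSTANCES */

static int toy_instances_global;      /* live Wasm instances */
static bool toy_eviction_requested;   /* removal of old instances asked for */

typedef struct {
    uint32_t counter;     /* executions on TCI so far */
    bool instantiated;    /* true once a Wasm instance exists */
} ToyTB;

/*
 * Record one TCI execution of tb. Returns true if the TB was
 * instantiated as Wasm by this call.
 */
static bool toy_note_tci_exec(ToyTB *tb)
{
    assert(!tb->instantiated);
    if (++tb->counter < TOY_THRESHOLD) {
        return false;                     /* not hot yet */
    }
    if (toy_instances_global >= TOY_MAX_INSTANCES) {
        toy_eviction_requested = true;    /* stay on TCI for now */
        return false;
    }
    toy_instances_global++;
    tb->instantiated = true;
    return true;
}

/* Called when the runtime reports that an old instance is gone. */
static void toy_instance_finalized(void)
{
    assert(toy_instances_global > 0);
    toy_instances_global--;
    toy_eviction_requested = false;
}
```

Because the counter keeps growing while a TB is refused, a hot TB is instantiated on its first TCI execution after a slot is freed.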
This commit enables qemu_ld and qemu_st to perform TLB lookups, following the approach used in other backends such as RISC-V. Unlike other backends, the Wasm backend cannot use ldst labels, as jumping to specific code addresses (e.g. raddr) is not possible in Wasm. Instead, each TLB lookup is followed by an if branch: if the lookup succeeds, the memory is accessed directly; otherwise, a fallback helper function is invoked.

Support for MO_BSWAP is not yet implemented, so has_memory_bswap is set to false.

Signed-off-by: Kohei Tokunaga <[email protected]>
Emscripten uses the optimization flag at link time to enable optimizations via Binaryen [1]. While meson.build currently recognizes the -Doptimization option, it does not propagate it to the linking. This commit updates meson.build to propagate the optimization flag to the linking when targeting WebAssembly. [1] https://emscripten.org/docs/optimizing/Optimizing-Code.html#how-emscripten-optimizes Signed-off-by: Kohei Tokunaga <[email protected]>
Check if wasm backend can be built in CI. Signed-off-by: Kohei Tokunaga <[email protected]>
Closed