Skip to content

Conversation

@edolstra
Copy link
Collaborator

@edolstra edolstra commented Jan 5, 2026

Motivation

This PR adds support for extending Nix using WASM (specifically wasmtime) in two ways:

  • It adds a new builtin function, builtins.wasm, that allows Nix functions to be written in any language that compiles to WASM. For example,

    builtins.wasm ./nix_wasm_plugin_fib.wasm "fib" 40

    The goal is to allow complex functionality (like YAML parsers) to be written in more convenient languages (like Rust) and executed faster, without having to resort to import-from-derivation.

    The WASM function can lazily traverse its argument and can produce arbitrary Nix values (except functions). Each WASM function call is executed as a fresh instance of the WASM VM, ensuring determinism (i.e. you can't use the WASM memory to pass state between function calls).

    See Initial bindings/examples for writing Nix functions in Rust nix-wasm-rust#1 for examples of Nix functions written in Rust.

  • It adds a new system type, wasm32-wasip1, that allows for system-independent derivations. For example:

    derivation {
      name = "wasi-test";
      system = "wasm32-wasip1";
      builder = ./nix-wasi-builder-hello.wasm;
      args = [ "World!" ];
    }

    Example here: https://github.com/DeterminateSystems/nix-wasi-rust

Context

Summary by CodeRabbit

  • New Features

    • WASM support in nix expressions with a new primitive to call WASM functions.
    • WASI-based derivations enabling local builds that run WASM builders.
  • Packaging

    • Added the Wasmtime runtime as a packaged dependency.
  • Build / Integration

    • Core components now link with Wasmtime to enable WASM execution.
  • Experimental Features

    • New flags: "wasm-builtin" and "wasm-derivations" to enable WASM capabilities.

✏️ Tip: You can customize this high-level summary in your review settings.

It adds ~400 MB to the Nix closure which is a lot.
Also, require WASM plugins to export a `nix_wasm_init_v1` function.
We don't want to reuse instances between WASM calls, since that's
impure: the WASM function could use some mutable global state to
return a different value every time. So instead, pre-instantiate
instances, allowing them to be instantiated efficiently.

Unfortunately, while this is much faster than not using
pre-instantiation, it is slower than reusing an instance: about 1.3s
compared to 0.7s for the call.nix test in nix-wasm-plugin-test (which
performs 100K WASM calls).
@edolstra edolstra marked this pull request as draft January 5, 2026 13:03
@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

Walkthrough

Adds Wasmtime packaging and linking, implements a Wasmtime-backed wasm primop and a WASI derivation builder for wasm32-wasip1, updates build inputs/linker flags and path-realization API, and introduces experimental flags for wasm builtins and wasm-targeted derivations.

Changes

Cohort / File(s) Summary
Packaging & Dependencies
packaging/dependencies.nix, packaging/wasmtime.nix
Add wasmtime package (wasmtime v40.0.2) and expose it in project dependencies.
Expression Language — WASM Primop & Sources
src/libexpr/primops/wasm.cc, src/libexpr/primops/meson.build
New Wasmtime-backed WASM primop implementation; add wasm.cc to primops sources and register the wasm primop.
Expression Language — Build & API
src/libexpr/meson.build, src/libexpr/package.nix, src/libexpr/primops.cc
Link -lwasmtime into nixexpr, add wasmtime to buildInputs, and change realisePath from static to external linkage.
Store — WASI Derivation Builder
src/libstore/unix/build/wasi-derivation-builder.cc, src/libstore/unix/build/derivation-builder.cc
Add WasiDerivationBuilder implementing execBuilder via Wasmtime/WASI; instantiate it for wasm32-wasip1; rename realPathInSandboxrealPathInHost and update call sites.
Store — Build Config & Platform Support
src/libstore/meson.build, src/libstore/derivation-options.cc, src/libstore/package.nix, src/libstore/unix/build/chroot-derivation-builder.cc
Link nixstore with -lwasmtime, add wasmtime to build inputs, allow wasm32-wasip1 in local builds, and update chroot builder method name to realPathInHost.
Experimental Features
src/libutil/experimental-features.cc, src/libutil/include/nix/util/experimental-features.hh
Add wasm-builtin and wasm-derivations experimental flags and enum entries.

Sequence Diagram(s)

sequenceDiagram
    participant User as Nix User
    participant Eval as EvalState
    participant Cache as InstanceCache
    participant Engine as Wasmtime Engine/Linker
    participant Module as WASM Module
    participant Instance as WASM Instance
    participant Memory as WASM Memory
    participant Host as Host Functions

    User->>Eval: prim_wasm(path, entry, arg)
    Eval->>Cache: lookup(path)
    alt cached
        Cache-->>Eval: pre-instantiated data
    else not cached
        Eval->>Engine: create Engine & Linker
        Engine->>Host: register host functions
        Eval->>Module: load module from path
        Engine->>Module: compile & instantiate with Linker
        Module->>Instance: create instance & memory
        Cache-->>Eval: store pre-instantiation
    end
    Eval->>Instance: locate entry function
    Eval->>Memory: write arg into memory
    Eval->>Instance: call entry(ptr)
    Instance->>Host: call host functions as needed
    Instance-->>Eval: return result ptr
    Eval->>Memory: read result
    Eval-->>User: return Nix Value
Loading
sequenceDiagram
    participant Store as Nix Store
    participant Dispatcher as Builder Dispatcher
    participant WasiBuilder as WASI Derivation Builder
    participant Wasmtime as Wasmtime Engine
    participant WasmModule as WASM Module
    participant Instance as WASM Instance
    participant WasiEnv as WASI Environment

    Store->>Dispatcher: makeDerivationBuilder(drv: wasm32-wasip1)
    Dispatcher->>WasiBuilder: instantiate WASI builder
    WasiBuilder->>Wasmtime: new Engine & Linker
    WasiBuilder->>WasiEnv: configure WASI (stdio, argv, env, preopens)
    WasiBuilder->>WasmModule: load builder wasm path
    Wasmtime->>Instance: instantiate module
    WasiBuilder->>Instance: locate _start export
    WasiBuilder->>Instance: invoke _start()
    Instance->>WasiEnv: perform WASI syscalls
    Instance-->>WasiBuilder: execution complete
    WasiBuilder-->>Store: build finished
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

flake-regression-test

Suggested reviewers

  • cole-h

Poem

🐰 I hopped in with Wasmtime bright,

Modules wake and take their flight,
WASI builders drum and run,
Nix and wasm now work as one,
Hooray — more hops, more fun! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 3.85% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'WASM support' directly and clearly describes the main objective of the PR, which is to add WebAssembly support to Nix with wasmtime integration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Fix all issues with AI Agents 🤖
In @src/libexpr/primops/wasm.cc:
- Around line 255-267: The code reads ValueId uint32_t values directly from WASM
memory in make_list, copy_list, make_attrset, and copy_attrset, assuming host
endianness; fix this by converting each 32-bit word from little-endian to host
order when constructing ValueId (e.g., apply le32toh or an equivalent portable
little-endian-to-host conversion) for every read from the memory subspan (the
vs[] accesses) so ValueId values are correct on big-endian hosts; update all
occurrences in the listed functions to perform this conversion before indexing
into values or assigning into lists/attrsets.
- Around line 334-350: In copy_attrname, replace the compile-time asserts with
runtime validations: check that attrIdx < attrs.size() and that len ==
name.size(); if either check fails, raise a runtime error (e.g., throw
std::runtime_error with a clear message mentioning valueId, attrIdx, len and
name.size()) or otherwise signal a trap/error path instead of proceeding; only
perform memcpy when both checks pass and keep the rest of the logic
(state.forceAttrs, obtaining name from state.symbols, and writing into memory())
unchanged.
- Around line 433-439: The static variable name `primop_fromTOML` is misleading
for the WASM primop registration; rename the variable to a descriptive
identifier like `primop_wasm` (i.e., change `static RegisterPrimOp
primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`) so it matches
the registered `.name = "wasm"` and `.fun = prim_wasm`; update any local
references to this static variable if present in the file to use the new name.
- Around line 309-332: The copy_attrset function writes attributes into WASM
memory using enumerate(*value.attrs()) which is non-deterministic; to fix,
gather the attributes into a temporary vector (e.g., vector of pairs or pointers
to attr), sort that vector lexicographically by the attribute name string via
state.symbols[attr.name], then iterate the sorted vector to populate
buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().
- Around line 411-418: The static unordered_map instancesPre is not thread-safe
and leaks entries; replace it with a thread-safe concurrent map (use
boost::concurrent_flat_map keyed by SourcePath) and store weak references
instead of owning refs so entries can be reclaimed (e.g., map value -> weak
pointer to NixWasmInstancePre or equivalent weak_ref). Update the lookup/insert
logic around instancesPre and the use of make_ref<NixWasmInstancePre>(wasmPath)
so you first attempt to lock the weak entry, create a new instance only if
expired, and store a weak reference in the concurrent_flat_map; optionally add
an eviction or LRU policy if unbounded growth remains a concern.
- Around line 49-53: The subspan function currently uses assert to check buffer
size which is omitted in release builds; replace the assert in
subspan(std::span<uint8_t> s, size_t len) with a runtime check that throws a
descriptive exception (e.g., std::out_of_range or std::length_error) when
s.size() < len * sizeof(T), include the required header ( <stdexcept> ), and
keep the returned span construction (use reinterpret_cast<T*>(s.data()) for
clarity) so callers get a safe failure instead of potential buffer overruns.
🧹 Nitpick comments (12)
src/libstore/derivation-options.cc (1)

399-400: Consider requiring a system feature for WASM support.

The special-casing of "wasm32-wasip1" allows any system to build WASM derivations, which aligns with the PR's goal of system-independent derivations. However, this assumes Wasmtime is always available.

Consider whether wasm32-wasip1 builds should require a system feature (e.g., "wasm" or "wasmtime") to ensure the runtime is actually available on the build machine. This would be checked by the existing getRequiredSystemFeatures logic at lines 406-408.

🔎 Alternative approach using system features

If you want to gate WASM builds on explicit system capability:

  1. Have wasm32-wasip1 derivations declare requiredSystemFeatures = ["wasm"];
  2. Remove the special case here, letting the normal system feature check at lines 406-408 handle it:
-    if (drv.platform != settings.thisSystem.get() && drv.platform != "wasm32-wasip1"
-        && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin())
+    if (drv.platform != settings.thisSystem.get()
+        && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin())
         return false;

This would make WASM support more explicit and consistent with how other cross-compilation scenarios are handled.

src/libstore/meson.build (1)

383-383: Consider using Meson's dependency system instead of hardcoded linker flags.

Directly appending -lwasmtime to link_args bypasses Meson's dependency resolution, which can lead to portability and configuration issues:

  • No verification that the wasmtime library is actually available
  • Library search paths may not be configured correctly
  • Platform-specific library naming conventions (e.g., .dll vs .so) are not handled
  • Build will fail with cryptic linker errors rather than clear dependency errors
🔎 Recommended approach using Meson dependency system

Near the top of the file with other dependencies (around line 109-161), add:

+wasmtime = dependency('wasmtime', required : true)
+deps_private += wasmtime

Then remove the hardcoded linker flag:

-  link_args : linker_export_flags + [ '-lwasmtime' ],
+  link_args : linker_export_flags,

This provides better error messages, respects pkg-config paths, and handles platform differences automatically.

src/libexpr/meson.build (1)

242-242: Consider using Meson's dependency system instead of hardcoded linker flags.

This has the same portability concerns as the identical change in src/libstore/meson.build at line 383. Hardcoded -lwasmtime bypasses Meson's dependency resolution and can cause issues with library paths, platform-specific naming, and error reporting.

🔎 Recommended approach using Meson dependency system

Near the dependencies section (around lines 41-92), add:

+wasmtime = dependency('wasmtime', required : true)
+deps_private += wasmtime

Then remove the hardcoded linker flag:

-  link_args : linker_export_flags + [ '-lwasmtime' ],
+  link_args : linker_export_flags,

This approach provides better error messages, respects pkg-config configurations, and handles cross-platform differences automatically.

src/libstore/unix/build/derivation-builder.cc (1)

2033-2035: Early return for wasm32-wasip1 bypasses later sandbox / uid‑range checks

Routing params.drv.platform == "wasm32-wasip1" directly to WasiDerivationBuilder happens before:

  • The diverted store check (storeDir != realStoreDir) and
  • The !useSandbox && params.drvOptions.useUidRange(drv) guard.

That means for WASI builds:

  • uid-range requests will no longer be rejected when sandboxing is effectively disabled here.
  • Diverted stores on non‑Linux platforms will also be allowed for this path, unlike other builders.

If this is intentional (relying on the WASI runtime rather than OS sandboxing), consider making that explicit and either:

  • Reject uid-range in WasiDerivationBuilder’s path (or in this branch) so configuration errors are surfaced, or
  • Document that uid-range is ignored for wasm32-wasip1 and that OS sandboxing semantics differ from other systems.
src/libexpr/primops.cc (1)

168-188: Making realisePath non‑static for reuse is reasonable

Exposing SourcePath realisePath(...) (no longer static) with the optional SymlinkResolution parameter allows other primop modules (e.g. WASM/WASI ones) to share the same path + context resolution logic. The implementation is consistent with existing callers, and the default (Full) preserves previous behavior.

Just ensure there is a matching declaration in a header (e.g. primops.hh) so external uses stay type‑checked and the function remains the single canonical implementation.

src/libstore/unix/build/wasi-derivation-builder.cc (1)

5-18: Deduplicate unwrap / string2span helpers and tighten const‑correctness

Both unwrap and string2span appear to be cut‑and‑paste from src/libexpr/primops/wasm.cc, which means future changes to the Wasmtime API or error handling will have to be done in two places.

Consider:

  • Moving these helpers to a small shared header (e.g. a wasm-runtime-utils.hh) used by both primops and the WASI builder, and
  • Switching string2span to use a std::span<const uint8_t> (and avoiding the C‑style cast) if the Wasmtime C++ API accepts a const span, to keep const‑correctness and avoid accidental mutation.

Please double‑check the current wasmtime::Module::compile signature in your pinned Wasmtime version to confirm whether std::span<const uint8_t> is acceptable and adjust the helper accordingly.

src/libexpr/primops/wasm.cc (6)

6-6: Consider removing using namespace wasmtime; for clarity.

While acceptable in implementation files, explicitly qualifying wasmtime types improves code clarity and prevents potential name collisions with future additions to the nix namespace.


10-15: Refactor: Remove duplicate declaration and include proper header.

The realisePath function is already defined in src/libexpr/primops.cc (lines 167-187). Instead of forward-declaring it here, consider extracting it to a shared header file that both files can include, or making it part of the EvalState API.


94-103: Consider factory method for clearer error handling.

The constructor performs heavy operations (WASM compilation via unwrap()) that can throw. Consider using a static factory method that returns Result<ref<NixWasmInstancePre>, Error> for clearer error handling and easier testing.


182-196: Simplify error throwing pattern.

Line 196 uses an immediately-invoked lambda to throw an error, which is less clear than throwing directly. The ternary chain could also benefit from a switch statement for clarity.

🔎 Proposed refactor
 uint32_t get_type(ValueId valueId)
 {
     auto & value = *values.at(valueId);
     state.forceValue(value, noPos);
-    auto t = value.type();
-    return t == nInt        ? 1
-           : t == nFloat    ? 2
-           : t == nBool     ? 3
-           : t == nString   ? 4
-           : t == nPath     ? 5
-           : t == nNull     ? 6
-           : t == nAttrs    ? 7
-           : t == nList     ? 8
-           : t == nFunction ? 9
-                            : []() -> int { throw Error("unsupported type"); }();
+    switch (value.type()) {
+        case nInt: return 1;
+        case nFloat: return 2;
+        case nBool: return 3;
+        case nString: return 4;
+        case nPath: return 5;
+        case nNull: return 6;
+        case nAttrs: return 7;
+        case nList: return 8;
+        case nFunction: return 9;
+        default: throw Error("unsupported type");
+    }
 }

230-238: Consider adding documentation for two-phase copy pattern.

This function uses a common but potentially confusing pattern: it only copies data if the buffer is large enough, but always returns the actual size. A brief comment explaining that callers should check returnValue <= maxLen before using the buffer would improve clarity.


426-426: Improve readability by splitting complex expression.

Line 426 chains multiple operations in a single expression, making it difficult to read and debug. Consider breaking it into separate steps.

🔎 Proposed refactor
-        v = *instance.values.at(instance.runFunction(functionName, {(int32_t) instance.addValue(args[2])}).at(0).i32());
+        auto argId = instance.addValue(args[2]);
+        auto results = instance.runFunction(functionName, {(int32_t) argId});
+        auto resultId = results.at(0).i32();
+        v = *instance.values.at(resultId);
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51dacdd and 1eaba77.

📒 Files selected for processing (11)
  • packaging/dependencies.nix
  • packaging/wasmtime.nix
  • src/libexpr/meson.build
  • src/libexpr/package.nix
  • src/libexpr/primops.cc
  • src/libexpr/primops/meson.build
  • src/libexpr/primops/wasm.cc
  • src/libstore/derivation-options.cc
  • src/libstore/meson.build
  • src/libstore/unix/build/derivation-builder.cc
  • src/libstore/unix/build/wasi-derivation-builder.cc
🧰 Additional context used
🧬 Code graph analysis (3)
src/libstore/derivation-options.cc (1)
src/libstore/include/nix/store/derivation-options.hh (2)
  • drv (184-184)
  • drv (201-201)
src/libstore/unix/build/wasi-derivation-builder.cc (1)
src/libexpr/primops/wasm.cc (7)
  • unwrap (20-25)
  • unwrap (20-20)
  • v (131-136)
  • v (131-131)
  • v (138-143)
  • string2span (38-41)
  • string2span (38-38)
src/libexpr/primops/wasm.cc (1)
src/libexpr/primops.cc (4)
  • state (3218-3224)
  • state (3218-3218)
  • realisePath (168-188)
  • realisePath (168-172)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (6)
src/libexpr/primops/meson.build (1)

12-12: LGTM!

Adding wasm.cc to the primops sources is straightforward and correct.

src/libexpr/package.nix (1)

17-17: LGTM!

The wasmtime dependency is correctly added following the same pattern as toml11 and other build-time dependencies.

Also applies to: 68-68

packaging/dependencies.nix (1)

78-79: Wiring wasmtime into the dependency scope looks correct

wasmtime = pkgs.callPackage ./wasmtime.nix { }; is consistent with how other custom packages are exposed here; no further changes needed.

src/libstore/unix/build/derivation-builder.cc (2)

294-298: Naming comment on realPathInSandbox

The // FIXME: misnomer note correctly reflects that this helper currently just resolves store paths rather than honoring a sandbox path mapping. No functional impact, so this can remain as a low‑priority cleanup marker.


1996-2001: Including wasi-derivation-builder.cc alongside other platform builders is consistent

Adding #include "wasi-derivation-builder.cc" next to the other platform-specific builder includes matches the existing “single TU with included .cc files” pattern. No issues from an integration standpoint.

packaging/wasmtime.nix (1)

1-121: Wasmtime packaging looks solid; minor note on header-copy path robustness

Overall this derivation is well-structured:

  • Uses a pinned Rust toolchain (rust_1_89) and buildRustPackage.
  • Restricts doCheck to platforms with the required SIMD features and skips problematic aarch64 runs.
  • Splits outputs into out, dev, and lib, pruning .so / .a according to enableShared / enableStatic.
  • Handles shell completions correctly for both native and cross builds.
  • Adds an install-time --version check and an updateScript with an appropriate version regex.

One thing to keep in mind is the header copy:

cp -r target/${cargoShortTarget}/release/build/wasmtime-c-api-impl-*/out/include $dev/include

This relies on the exact crate/build directory naming and release profile path; if upstream layout or build profile changes, this will break somewhat opaquely. It may be worth:

  • Asserting that exactly one wasmtime-c-api-impl-* match exists, or
  • Failing with a clearer message if the glob expands to zero or multiple directories.

Otherwise the packaging is in good shape.

Please double-check against your pinned Wasmtime 40.0.0 sources that the wasmtime-c-api-impl-* build output layout (and the cargoShortTarget/release path) is stable; if upstream changes it, the cp -r .../out/include step will be the first thing to break.

Comment on lines +49 to +53
static std::span<T> subspan(std::span<uint8_t> s, size_t len)
{
assert(s.size() >= len * sizeof(T));
return std::span((T *) s.data(), len);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Replace assert with proper error handling.

The assert on line 51 is compiled out in release builds, which could lead to buffer overruns if the size check fails. Use a runtime check that throws an exception instead.

🔎 Proposed fix
 template<typename T>
 static std::span<T> subspan(std::span<uint8_t> s, size_t len)
 {
-    assert(s.size() >= len * sizeof(T));
+    if (s.size() < len * sizeof(T))
+        throw Error("buffer too small: expected at least %d bytes, got %d", len * sizeof(T), s.size());
     return std::span((T *) s.data(), len);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static std::span<T> subspan(std::span<uint8_t> s, size_t len)
{
assert(s.size() >= len * sizeof(T));
return std::span((T *) s.data(), len);
}
template<typename T>
static std::span<T> subspan(std::span<uint8_t> s, size_t len)
{
if (s.size() < len * sizeof(T))
throw Error("buffer too small: expected at least %d bytes, got %d", len * sizeof(T), s.size());
return std::span((T *) s.data(), len);
}
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 49-53, The subspan function
currently uses assert to check buffer size which is omitted in release builds;
replace the assert in subspan(std::span<uint8_t> s, size_t len) with a runtime
check that throws a descriptive exception (e.g., std::out_of_range or
std::length_error) when s.size() < len * sizeof(T), include the required header
( <stdexcept> ), and keep the returned span construction (use
reinterpret_cast<T*>(s.data()) for clarity) so callers get a safe failure
instead of potential buffer overruns.

Comment on lines +255 to +267
ValueId make_list(uint32_t ptr, uint32_t len)
{
auto vs = subspan<ValueId>(memory().subspan(ptr), len);

auto [valueId, value] = allocValue();

auto list = state.buildList(len);
for (const auto & [n, v] : enumerate(list))
v = values.at(vs[n]); // FIXME: endianness
value.mkList(list);

return valueId;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Address endianness assumption for cross-platform correctness.

The FIXME on line 263 notes that ValueIds are read from WASM memory without endianness conversion. WASM uses little-endian, so this code will fail on big-endian host systems. While big-endian systems are rare, this violates Nix's goal of deterministic, cross-platform builds.

The same issue appears in copy_list (line 278), make_attrset (line 296), and copy_attrset (line 322).

Consider using explicit little-endian conversion functions (e.g., le32toh) when reading multi-byte values from WASM memory.

🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 255-267, The code reads ValueId
uint32_t values directly from WASM memory in make_list, copy_list, make_attrset,
and copy_attrset, assuming host endianness; fix this by converting each 32-bit
word from little-endian to host order when constructing ValueId (e.g., apply
le32toh or an equivalent portable little-endian-to-host conversion) for every
read from the memory subspan (the vs[] accesses) so ValueId values are correct
on big-endian hosts; update all occurrences in the listed functions to perform
this conversion before indexing into values or assigning into lists/attrsets.

Comment on lines +309 to +332
uint32_t copy_attrset(ValueId valueId, uint32_t ptr, uint32_t maxLen)
{
auto & value = *values.at(valueId);
state.forceAttrs(value, noPos, "while copying an attrset into WASM");

if (value.attrs()->size() <= maxLen) {
// FIXME: endianness.
struct Attr
{
ValueId value;
uint32_t nameLen;
};

auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);

// FIXME: for determinism, we should return attributes in lexicographically sorted order.
for (const auto & [n, attr] : enumerate(*value.attrs())) {
buf[n].value = addValue(attr.value);
buf[n].nameLen = state.symbols[attr.name].size();
}
}

return value.attrs()->size();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix: Ensure deterministic attribute ordering.

The FIXME on line 324 correctly identifies that attributes should be returned in lexicographically sorted order for determinism. Nix's evaluation model requires consistent attribute ordering across evaluations.

🔎 Proposed fix
         if (value.attrs()->size() <= maxLen) {
             // FIXME: endianness.
             struct Attr
             {
                 ValueId value;
                 uint32_t nameLen;
             };
 
             auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);
 
-            // FIXME: for determinism, we should return attributes in lexicographically sorted order.
-            for (const auto & [n, attr] : enumerate(*value.attrs())) {
+            // Collect and sort attributes by name for determinism
+            std::vector<std::pair<std::string_view, const Attr*>> sortedAttrs;
+            for (const auto & attr : *value.attrs()) {
+                sortedAttrs.push_back({state.symbols[attr.name], &attr});
+            }
+            std::sort(sortedAttrs.begin(), sortedAttrs.end(),
+                [](const auto & a, const auto & b) { return a.first < b.first; });
+            
+            for (const auto & [n, pair] : enumerate(sortedAttrs)) {
+                const auto & attr = *pair.second;
                 buf[n].value = addValue(attr.value);
                 buf[n].nameLen = state.symbols[attr.name].size();
             }
         }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 309-332, The copy_attrset function
writes attributes into WASM memory using enumerate(*value.attrs()) which is
non-deterministic; to fix, gather the attributes into a temporary vector (e.g.,
vector of pairs or pointers to attr), sort that vector lexicographically by the
attribute name string via state.symbols[attr.name], then iterate the sorted
vector to populate buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().

Comment on lines +334 to +350
std::monostate copy_attrname(ValueId valueId, uint32_t attrIdx, uint32_t ptr, uint32_t len)
{
auto & value = *values.at(valueId);
state.forceAttrs(value, noPos, "while copying an attr name into WASM");

auto & attrs = *value.attrs();

assert((size_t) attrIdx < attrs.size());

std::string_view name = state.symbols[attrs[attrIdx].name];

assert((size_t) len == name.size());

memcpy(memory().subspan(ptr, len).data(), name.data(), name.size());

return {};
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Replace assertions with runtime checks.

Lines 341 and 345 use assert statements that are compiled out in release builds. This can lead to:

  • Line 341: Out-of-bounds array access if attrIdx >= attrs.size()
  • Line 345: Buffer overflow if len != name.size()
🔎 Proposed fix
 std::monostate copy_attrname(ValueId valueId, uint32_t attrIdx, uint32_t ptr, uint32_t len)
 {
     auto & value = *values.at(valueId);
     state.forceAttrs(value, noPos, "while copying an attr name into WASM");
 
     auto & attrs = *value.attrs();
 
-    assert((size_t) attrIdx < attrs.size());
+    if ((size_t) attrIdx >= attrs.size())
+        throw Error("attribute index %d out of bounds (size: %d)", attrIdx, attrs.size());
 
     std::string_view name = state.symbols[attrs[attrIdx].name];
 
-    assert((size_t) len == name.size());
+    if ((size_t) len != name.size())
+        throw Error("buffer size mismatch: expected %d bytes, got %d", name.size(), len);
 
     memcpy(memory().subspan(ptr, len).data(), name.data(), name.size());
 
     return {};
 }
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 334-350, In copy_attrname, replace
the compile-time asserts with runtime validations: check that attrIdx <
attrs.size() and that len == name.size(); if either check fails, raise a runtime
error (e.g., throw std::runtime_error with a clear message mentioning valueId,
attrIdx, len and name.size()) or otherwise signal a trap/error path instead of
proceeding; only perform memcpy when both checks pass and keep the rest of the
logic (state.forceAttrs, obtaining name from state.symbols, and writing into
memory()) unchanged.

Comment on lines +411 to +418
// FIXME: make thread-safe.
// FIXME: make this a weak Boehm GC pointer so that it can be freed during GC.
static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre;

auto instancePre = instancesPre.find(wasmPath);
if (instancePre == instancesPre.end())
instancePre = instancesPre.emplace(wasmPath, make_ref<NixWasmInstancePre>(wasmPath)).first;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if EvalState or prim_* functions are used in concurrent contexts
rg -n "std::thread|std::async|pthread" --type=cpp -C 2 | head -50

# Check for existing thread-safety patterns in primops
rg -n "std::mutex|std::lock_guard|std::atomic" src/libexpr/ --type=cpp -C 2 | head -50

Repository: DeterminateSystems/nix-src

Length of output: 5273


🏁 Script executed:

#!/bin/bash
# Look at the actual file context around lines 411-418
cat -n src/libexpr/primops/wasm.cc | sed -n '400,430p'

Repository: DeterminateSystems/nix-src

Length of output: 1640


🏁 Script executed:

#!/bin/bash
# Check how EvalState handles concurrent evaluation
rg -n "EvalState.*thread|thread.*EvalState" --type=cpp -C 3 | head -40

# Look for parallel evaluation support
rg -n "parallel|concurrent|thread-safe" src/libexpr/ --type=cpp | grep -i "eval\|prim" | head -20

Repository: DeterminateSystems/nix-src

Length of output: 2211


Fix thread-unsafe static cache in prim_wasm.

The static instancesPre map on line 413 is not thread-safe and will race under parallel evaluation. Nix supports parallel evaluation (see builtins.parallel and parallel-eval.cc), making this a critical issue.

The cache grows unbounded and is never freed. Use boost::concurrent_flat_map (already used elsewhere in EvalState for srcToStore and importResolutionCache) and consider weak pointers or LRU eviction to prevent memory bloat.

🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 411-418, The static unordered_map
instancesPre is not thread-safe and leaks entries; replace it with a thread-safe
concurrent map (use boost::concurrent_flat_map keyed by SourcePath) and store
weak references instead of owning refs so entries can be reclaimed (e.g., map
value -> weak pointer to NixWasmInstancePre or equivalent weak_ref). Update the
lookup/insert logic around instancesPre and the use of
make_ref<NixWasmInstancePre>(wasmPath) so you first attempt to lock the weak
entry, create a new instance only if expired, and store a weak reference in the
concurrent_flat_map; optionally add an eviction or LRU policy if unbounded
growth remains a concern.

Comment on lines 433 to 439
static RegisterPrimOp primop_fromTOML(
{.name = "wasm",
.args = {"wasm", "entry", "arg"},
.doc = R"(
Call a WASM function with the specified argument.
)",
.fun = prim_wasm});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix misleading variable name.

The variable is named primop_fromTOML but actually registers the wasm primop. This appears to be a copy-paste error from another primop definition.

🔎 Proposed fix
-static RegisterPrimOp primop_fromTOML(
+static RegisterPrimOp primop_wasm(
     {.name = "wasm",
      .args = {"wasm", "entry", "arg"},
-     .doc = R"(
-      Call a WASM function with the specified argument.
-     )",
+     .doc = R"(
+       Call a WASM function from a WebAssembly module.
+       
+       Arguments:
+       - wasm: Path to the .wasm file
+       - entry: Name of the exported function to call
+       - arg: Argument to pass to the function (Nix value)
+       
+       Returns the result from the WASM function as a Nix value.
+       The WASM module must export a `nix_wasm_init_v1` function.
+     )",
      .fun = prim_wasm});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static RegisterPrimOp primop_fromTOML(
{.name = "wasm",
.args = {"wasm", "entry", "arg"},
.doc = R"(
Call a WASM function with the specified argument.
)",
.fun = prim_wasm});
static RegisterPrimOp primop_wasm(
{.name = "wasm",
.args = {"wasm", "entry", "arg"},
.doc = R"(
Call a WASM function from a WebAssembly module.
Arguments:
- wasm: Path to the .wasm file
- entry: Name of the exported function to call
- arg: Argument to pass to the function (Nix value)
Returns the result from the WASM function as a Nix value.
The WASM module must export a `nix_wasm_init_v1` function.
)",
.fun = prim_wasm});
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 433-439, The static variable name
`primop_fromTOML` is misleading for the WASM primop registration; rename the
variable to a descriptive identifier like `primop_wasm` (i.e., change `static
RegisterPrimOp primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`)
so it matches the registered `.name = "wasm"` and `.fun = prim_wasm`; update any
local references to this static variable if present in the file to use the new
name.

Comment on lines 29 to 72
void execBuilder(const Strings & args, const Strings & envStrs) override
{
using namespace wasmtime;

Engine engine;
Linker linker(engine);
unwrap(linker.define_wasi());

WasiConfig wasiConfig;
wasiConfig.inherit_stdin();
wasiConfig.inherit_stdout();
wasiConfig.inherit_stderr();
wasiConfig.argv(std::vector(args.begin(), args.end()));
{
std::vector<std::pair<std::string, std::string>> env2;
for (auto & [k, v] : env)
env2.emplace_back(k, rewriteStrings(v, inputRewrites));
wasiConfig.env(env2);
}
if (!wasiConfig.preopen_dir(
store.config->realStoreDir.get(),
store.storeDir,
WASMTIME_WASI_DIR_PERMS_READ | WASMTIME_WASI_DIR_PERMS_WRITE,
WASMTIME_WASI_FILE_PERMS_READ | WASMTIME_WASI_FILE_PERMS_WRITE))
throw Error("cannot add store directory to WASI config");
// FIXME: add temp dir

auto module = unwrap(Module::compile(engine, string2span(readFile(realPathInSandbox(drv.builder)))));
wasmtime::Store wasmStore(engine);
unwrap(wasmStore.context().set_wasi(std::move(wasiConfig)));
auto instance = unwrap(linker.instantiate(wasmStore, module));

auto startName = "_start";
auto ext = instance.get(wasmStore, startName);
if (!ext)
throw Error("WASM module '%s' does not export function '%s'", drv.builder, startName);
auto fun = std::get_if<Func>(&*ext);
if (!fun)
throw Error("export '%s' of WASM module '%s' is not a function", startName, drv.builder);

unwrap(fun->call(wasmStore.context(), {}));

_exit(0);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd src/libstore/unix/build && wc -l wasi-derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 99


🏁 Script executed:

cd src/libstore/unix/build && head -80 wasi-derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 2592


🏁 Script executed:

rg -n "class.*DerivationBuilder" src/libstore/unix/build/ -A 5

Repository: DeterminateSystems/nix-src

Length of output: 510


🏁 Script executed:

rg -n "initEnv" src/libstore/unix/build/ -A 10

Repository: DeterminateSystems/nix-src

Length of output: 2704


🏁 Script executed:

rg -n "runChild" src/libstore/unix/build/build-user.cc -A 15

Repository: DeterminateSystems/nix-src

Length of output: 147


🏁 Script executed:

rg -n "runChild" src/libstore/ -B 2 -A 15

Repository: DeterminateSystems/nix-src

Length of output: 9282


🏁 Script executed:

sed -n '1057,1150p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 4230


🏁 Script executed:

rg -n "void.*execBuilder" src/libstore/ -B 5 -A 5

Repository: DeterminateSystems/nix-src

Length of output: 3761


🏁 Script executed:

sed -n '1276,1360p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 2796


🏁 Script executed:

rg -n "tmpDirInSandbox" src/libstore/unix/build/ -A 3

Repository: DeterminateSystems/nix-src

Length of output: 4675


🏁 Script executed:

sed -n '1340,1390p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 2017


🏁 Script executed:

rg -n "for.*env\|envStrs" src/libstore/unix/build/derivation-builder.cc -A 3 -B 2 | grep -A 5 "1340\|1350\|1360\|1370"

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

sed -n '1350,1380p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 1016


Address unused envStrs parameter and duplicate env rewriting

The envStrs parameter is computed in runChild() by iterating over env and applying rewriteStrings() to each key=value pair, then passed to execBuilder(). The WASI override ignores this and rebuilds env2 using identical logic. Reuse envStrs by parsing it to avoid duplication and ensure consistency if env handling changes.

Preopen the temp directory and adjust store permissions

initEnv() sets TMPDIR, NIX_BUILD_TOP, and PWD to tmpDirInSandbox(), but only the store directory is preopened in the WASI config. WASM builders attempting to write to $TMPDIR or access $NIX_BUILD_TOP will fail. Additionally, the store is preopened with READ|WRITE permissions on all files; typical chroot builds expose stores read-only, with only outputs writable. Add a preopen for the temp directory and consider making the store read-only while exposing outputs or scratch locations separately to match existing sandbox assumptions.

@github-actions github-actions bot temporarily deployed to pull request January 5, 2026 15:17 Inactive
@github-actions
Copy link

github-actions bot commented Jan 5, 2026

This function was named incorrectly. It denotes the path in the host
filesystem, not in the sandbox.
@github-actions github-actions bot temporarily deployed to pull request January 5, 2026 17:20 Inactive
Even though it's in a separate output, disabling it reduces the size
of libwasmtime.so by around 12 MB.
This reduces the size of the Nix closure by another 12 MB.
@github-actions github-actions bot temporarily deployed to pull request January 6, 2026 14:08 Inactive
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a copyright notice that points back to nixpkgs, since this is a verbatim copy of the package definition from there.

@github-actions github-actions bot temporarily deployed to pull request January 19, 2026 09:12 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 19, 2026 14:31 Inactive
@edolstra edolstra marked this pull request as ready for review January 19, 2026 16:02
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/libstore/package.nix (1)

64-69: Guard wasmtime by platform compatibility.

wasmtime is marked unix-only in its package metadata (platforms = lib.platforms.unix), but the library declares Windows support. Adding it unconditionally to buildInputs breaks evaluation on Windows. Use a platform guard like lib.optional stdenv.hostPlatform.isUnix.

Proposed fix
  buildInputs = [
    boost
    curl
    sqlite
-   wasmtime
  ]
+  ++ lib.optional stdenv.hostPlatform.isUnix wasmtime
   ++ lib.optional stdenv.hostPlatform.isLinux libseccomp
   ++ lib.optional withAWS aws-crt-cpp;
🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 369-379: The regFun host callback only catches Error exceptions,
so std::any_cast, container access, or other operations can throw
std::exception-derived or unknown exceptions and escape; update the lambda in
regFun (the callback created in func_wrap) to add catch(const std::exception& e)
that returns Trap(e.what()) and a final catch(...) that returns a generic Trap
with an explanatory message (e.g., "unknown exception in host callback") so all
exceptions from the Caller/context access and method invocation are converted to
Traps.
- Around line 120-127: The constructor NixWasmInstance currently assumes
instance.get(wasmCtx, "memory") yields a Memory and directly uses std::get,
which can throw std::bad_variant_access; replace this with the same safe pattern
used in getFunction(): call instance.get(wasmCtx, "memory"), check the optional
export exists, then use std::get_if<Memory> on the contained Export to validate
the type, and if missing or wrong type throw a clear runtime_error (or use
unwrap) with a descriptive message; set memory_ only after successful
validation.
- Around line 182-186: Add a bounds-check helper (e.g., validateValueId or
checkValueId) that checks the supplied ValueId against values.size() and throws
the codebase Error type on out-of-range IDs, then replace direct uses of
values.at(valueId) with a call to that helper followed by safe indexed access
(values[valueId]) in all WASM-exposed accessors: get_type, get_int, get_float,
copy_string, get_bool, make_list, copy_list, make_attrset, copy_attrset,
copy_attrname, and call_function so WASM-supplied IDs produce a uniform Error
exception instead of std::out_of_range.
- Around line 162-165: The code uses memory().subspan(ptr, len) without runtime
bounds checks, which allows OOB access when WASM passes untrusted ptr/len; add a
checked helper (e.g., a function check_subspan(ptr, len) that fetches
memory_.data(wasmCtx), verifies ptr + len <= memory.size() and throws or returns
an error/empty optional on violation) and replace all direct
memory().subspan(...) usages in the WASM entry points (panic(), warn(),
make_string(), make_list(), make_attrset(), copy_attrname(), call_function() and
any other places referencing memory().subspan) to call this helper before
accessing data so reads/writes are guarded and undefined behavior is prevented.
Ensure the helper uses the same wasmCtx/memory_ access path
(memory_.data(wasmCtx)) and consistently handles errors (e.g., return an
expected/optional or throw) so callers perform no unchecked subspan operations.
♻️ Duplicate comments (8)
src/libstore/unix/build/wasi-derivation-builder.cc (1)

29-72: Address unused envStrs parameter and missing temp directory preopen.

Two issues remain from previous review:

  1. Unused parameter: The envStrs parameter (computed by runChild() with rewriteStrings() applied) is ignored. Instead, lines 43-46 rebuild the environment by iterating over env and re-applying rewriteStrings(). This duplicates work and risks inconsistency if env handling changes upstream. Consider parsing envStrs directly.

  2. Missing temp directory: The FIXME at line 54 acknowledges that the temp directory is not preopened. Since initEnv() sets TMPDIR, NIX_BUILD_TOP, and PWD to tmpDirInSandbox(), WASM builders attempting to access these paths will fail.

Suggested fix for temp directory
         if (!wasiConfig.preopen_dir(
                 store.config->realStoreDir.get(),
                 store.storeDir,
                 WASMTIME_WASI_DIR_PERMS_READ | WASMTIME_WASI_DIR_PERMS_WRITE,
                 WASMTIME_WASI_FILE_PERMS_READ | WASMTIME_WASI_FILE_PERMS_WRITE))
             throw Error("cannot add store directory to WASI config");
-        // FIXME: add temp dir
+        if (!wasiConfig.preopen_dir(
+                tmpDir,
+                tmpDirInSandbox(),
+                WASMTIME_WASI_DIR_PERMS_READ | WASMTIME_WASI_DIR_PERMS_WRITE,
+                WASMTIME_WASI_FILE_PERMS_READ | WASMTIME_WASI_FILE_PERMS_WRITE))
+            throw Error("cannot add temp directory to WASI config");
src/libexpr/primops/wasm.cc (6)

48-52: Replace assert in subspan with a runtime check.

assert is compiled out in release builds, so out-of-bounds spans can slip through. Use a real check and throw a descriptive Error.

🔧 Proposed fix
 template<typename T>
 static std::span<T> subspan(std::span<uint8_t> s, size_t len)
 {
-    assert(s.size() >= len * sizeof(T));
-    return std::span((T *) s.data(), len);
+    if (s.size() < len * sizeof(T))
+        throw Error("buffer too small: expected at least %zu bytes, got %zu",
+            len * sizeof(T), s.size());
+    return std::span(reinterpret_cast<T *>(s.data()), len);
 }

255-264: Handle endianness when reading ValueIds from WASM memory.

WASM memory is little-endian; direct reads will break on big-endian hosts. Convert each 32-bit word to host order.


309-328: Ensure deterministic attribute ordering.

Iterating *value.attrs() is not deterministic; for Nix determinism this should be lexicographically sorted by attribute name before serializing into WASM memory.


334-346: Replace assert checks with runtime validation.

The asserts guarding attrIdx and len are compiled out in release builds and can cause OOB reads/writes.


410-418: Make the instancesPre cache thread-safe (and bounded).

A static unordered_map is not thread-safe and can race under parallel evaluation; it also grows unbounded.


433-440: Rename the primop variable for clarity.

primop_fromTOML is misleading; it registers the wasm primop.

🔧 Proposed fix
-static RegisterPrimOp primop_fromTOML(
+static RegisterPrimOp primop_wasm(
packaging/wasmtime.nix (1)

1-13: Add explicit copyright/attribution for nixpkgs copy.

This file appears to be a verbatim copy from nixpkgs; add a copyright/attribution comment to stay compliant.

🔧 Proposed fix
 {
   lib,
   stdenv,
   rust_1_89,
   fetchFromGitHub,
   buildPackages,
   cmake,
   installShellFiles,
   nix-update-script,
   enableShared ? !stdenv.hostPlatform.isStatic,
   enableStatic ? stdenv.hostPlatform.isStatic,
 }:
+# Derived from nixpkgs: pkgs/development/wasmtime/default.nix
+# Please retain upstream copyright and license notices.
 rust_1_89.packages.stable.rustPlatform.buildRustPackage (finalAttrs: {
🧹 Nitpick comments (2)
src/libstore/unix/build/wasi-derivation-builder.cc (2)

5-18: Extract shared helpers to a common location.

Both unwrap() and string2span() are duplicated from src/libexpr/primops/wasm.cc (as noted by the FIXME comments). Consider extracting these to a shared header (e.g., nix/util/wasm-utils.hh) to avoid maintenance burden and potential divergence.


61-71: Consider handling WASM trap errors more gracefully.

If fun->call() fails (e.g., the WASM module traps), unwrap() will throw an Error. This exception will be caught by handleChildException() in the parent's runChild() wrapper, but the error message from a WASM trap may be cryptic. Consider wrapping the call with additional context:

auto result = fun->call(wasmStore.context(), {});
if (!result)
    throw Error("WASM module '%s' failed: %s", drv.builder, result.err().message());

Additionally, _exit(0) is called unconditionally after successful execution, which is correct for the child process flow.

Comment on lines +120 to +127
NixWasmInstance(EvalState & _state, ref<NixWasmInstancePre> _pre)
: state(_state)
, pre(_pre)
, wasmStore(pre->engine)
, wasmCtx(wasmStore)
, instance(unwrap(pre->instancePre.instantiate(wasmCtx)))
, memory_(std::get<Memory>(*instance.get(wasmCtx, "memory")))
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -name "wasm.cc" -o -name "*.h" | head -20

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

find / -name "wasm.cc" -type f 2>/dev/null | head -5

Repository: DeterminateSystems/nix-src

Length of output: 116


🏁 Script executed:

git ls-files | grep -i wasm

Repository: DeterminateSystems/nix-src

Length of output: 120


🏁 Script executed:

cat -n /home/jailuser/git/src/libexpr/primops/wasm.cc | head -150

Repository: DeterminateSystems/nix-src

Length of output: 4898


🏁 Script executed:

cat -n /home/jailuser/git/src/libexpr/primops/wasm.cc | sed -n '145,160p'

Repository: DeterminateSystems/nix-src

Length of output: 773


🏁 Script executed:

cat -n /home/jailuser/git/src/libexpr/primops/wasm.cc | sed -n '88,104p'

Repository: DeterminateSystems/nix-src

Length of output: 625


Add error handling for memory export validation.

The current code unsafely assumes instance.get(wasmCtx, "memory") returns a valid Memory export without checking. This will throw std::bad_variant_access if the export is missing or has a different type. The getFunction() method (lines 145-154) demonstrates the correct pattern already used in this file—validate existence first, then use std::get_if for safe type checking.

Fix
-    , memory_(std::get<Memory>(*instance.get(wasmCtx, "memory")))
+    , memory_(([&]() -> Memory {
+        auto ext = instance.get(wasmCtx, "memory");
+        if (!ext)
+            throw Error("WASM module '%s' does not export memory", pre->wasmPath);
+        auto mem = std::get_if<Memory>(&*ext);
+        if (!mem)
+            throw Error("export 'memory' of WASM module '%s' is not a memory", pre->wasmPath);
+        return *mem;
+    })())
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
NixWasmInstance(EvalState & _state, ref<NixWasmInstancePre> _pre)
: state(_state)
, pre(_pre)
, wasmStore(pre->engine)
, wasmCtx(wasmStore)
, instance(unwrap(pre->instancePre.instantiate(wasmCtx)))
, memory_(std::get<Memory>(*instance.get(wasmCtx, "memory")))
{
NixWasmInstance(EvalState & _state, ref<NixWasmInstancePre> _pre)
: state(_state)
, pre(_pre)
, wasmStore(pre->engine)
, wasmCtx(wasmStore)
, instance(unwrap(pre->instancePre.instantiate(wasmCtx)))
, memory_(([&]() -> Memory {
auto ext = instance.get(wasmCtx, "memory");
if (!ext)
throw Error("WASM module '%s' does not export memory", pre->wasmPath);
auto mem = std::get_if<Memory>(&*ext);
if (!mem)
throw Error("export 'memory' of WASM module '%s' is not a memory", pre->wasmPath);
return *mem;
})())
{
🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 120 - 127, The constructor
NixWasmInstance currently assumes instance.get(wasmCtx, "memory") yields a
Memory and directly uses std::get, which can throw std::bad_variant_access;
replace this with the same safe pattern used in getFunction(): call
instance.get(wasmCtx, "memory"), check the optional export exists, then use
std::get_if<Memory> on the contained Export to validate the type, and if missing
or wrong type throw a clear runtime_error (or use unwrap) with a descriptive
message; set memory_ only after successful validation.

Comment on lines +162 to +165
auto memory()
{
return memory_.data(wasmCtx);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, check the file exists and get its size
wc -l src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 101


🏁 Script executed:

# Read the context around lines 162-165 to see the memory() function and surrounding code
sed -n '150,200p' src/libexpr/primops/wasm.cc | cat -n

Repository: DeterminateSystems/nix-src

Length of output: 1918


🏁 Script executed:

# Search for all instances of memory().subspan in the file
rg 'memory\(\)\.subspan' src/libexpr/primops/wasm.cc -n

Repository: DeterminateSystems/nix-src

Length of output: 744


🏁 Script executed:

# Look for any existing bounds checking or validation in the file
rg 'bounds|check|size\(\)|out.of.bounds' src/libexpr/primops/wasm.cc -n -i

Repository: DeterminateSystems/nix-src

Length of output: 915


🏁 Script executed:

# Get the full context of the subspan helper and surrounding functions
sed -n '35,100p' src/libexpr/primops/wasm.cc | cat -n

Repository: DeterminateSystems/nix-src

Length of output: 2353


🏁 Script executed:

# Look for the template definition of subspan
rg 'template.*subspan|subspan.*template' src/libexpr/primops/wasm.cc -A 5 -B 2

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Get context around the unchecked calls to understand their safety
sed -n '160,250p' src/libexpr/primops/wasm.cc | cat -n

Repository: DeterminateSystems/nix-src

Length of output: 3206


🏁 Script executed:

# Check if there's any bounds validation in the calling WASM code or caller context
rg 'ptr.*len|memory.*check|validate' src/libexpr/primops/wasm.cc -i -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 2681


🏁 Script executed:

# Verify memory_.data() signature to understand what it returns
rg 'memory_\.' src/libexpr/primops/wasm.cc -B 3 -A 1 | head -30

Repository: DeterminateSystems/nix-src

Length of output: 138


🏁 Script executed:

# Check if there are any existing overflow checks or safe access patterns elsewhere in the code
rg 'uint32_t.*ptr.*uint32_t.*len' src/libexpr/primops/wasm.cc | wc -l

Repository: DeterminateSystems/nix-src

Length of output: 71


🏁 Script executed:

# Check the actual span subspan method to understand its signature
rg 'std::span|\.subspan' src/libexpr/primops/wasm.cc | head -20

Repository: DeterminateSystems/nix-src

Length of output: 1151


Add bounds-checked access for WASM memory pointers.

WASM supplies untrusted ptr/len values via seven functions: panic(), warn(), make_string(), make_list(), make_attrset(), copy_attrname(), and call_function(). These functions use memory().subspan(ptr, len) without runtime bounds validation, causing out-of-bounds access and undefined behavior in release builds. The custom subspan() template helper (line 51) relies on assert() which is disabled at runtime, providing no protection. Add a checked helper function that validates ptr + len ≤ memory.size() and apply it to all memory access sites.

Suggested helper implementation
     auto memory()
     {
         return memory_.data(wasmCtx);
     }
+
+    std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len)
+    {
+        auto mem = memory();
+        uint64_t end = (uint64_t) ptr + (uint64_t) len;
+        if (end > mem.size())
+            throw Error("WASM memory out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size());
+        return mem.subspan(ptr, len);
+    }

Apply to all nine memory access locations, including reads and writes at lines 169, 178, 226, 234, 257, 275, 322, 347, and 358.

🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 162 - 165, The code uses
memory().subspan(ptr, len) without runtime bounds checks, which allows OOB
access when WASM passes untrusted ptr/len; add a checked helper (e.g., a
function check_subspan(ptr, len) that fetches memory_.data(wasmCtx), verifies
ptr + len <= memory.size() and throws or returns an error/empty optional on
violation) and replace all direct memory().subspan(...) usages in the WASM entry
points (panic(), warn(), make_string(), make_list(), make_attrset(),
copy_attrname(), call_function() and any other places referencing
memory().subspan) to call this helper before accessing data so reads/writes are
guarded and undefined behavior is prevented. Ensure the helper uses the same
wasmCtx/memory_ access path (memory_.data(wasmCtx)) and consistently handles
errors (e.g., return an expected/optional or throw) so callers perform no
unchecked subspan operations.

Comment on lines +182 to +186
uint32_t get_type(ValueId valueId)
{
auto & value = *values.at(valueId);
state.forceValue(value, noPos);
auto t = value.type();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure and the specific lines mentioned
head -200 src/libexpr/primops/wasm.cc | tail -50

Repository: DeterminateSystems/nix-src

Length of output: 1516


🏁 Script executed:

# Get full context around lines 182-186
sed -n '170,200p' src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 961


🏁 Script executed:

# Find all usages of values.at() in the file to understand scope
rg "values\.at\(" src/libexpr/primops/wasm.cc -n

Repository: DeterminateSystems/nix-src

Length of output: 1001


🏁 Script executed:

# Check error handling patterns in the file (how Error is thrown)
rg "throw Error\(" src/libexpr/primops/wasm.cc -n -A 2 | head -40

Repository: DeterminateSystems/nix-src

Length of output: 652


🏁 Script executed:

# Look at the class definition to understand values container
rg -A 20 "struct.*Evaluator\|class.*Evaluator" src/libexpr/primops/wasm.cc | head -40

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Find the values container declaration
rg "std::vector.*values\|values.*std::vector" src/libexpr/primops/wasm.cc -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Check if there's a struct/class definition showing values member
sed -n '120,180p' src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 1794


🏁 Script executed:

# Look for any exception handlers that might catch std::out_of_range
rg "catch.*out_of_range|out_of_range" src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Check if std::out_of_range is caught anywhere in the broader codebase
rg "catch.*std::out_of_range|catch.*out_of_range" . -t cc -t h | head -20

Repository: DeterminateSystems/nix-src

Length of output: 96


🏁 Script executed:

# Find the full class/struct definition of NixWasmInstance to see values declaration
sed -n '100,170p' src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 1956


🏁 Script executed:

# Search for the values member in the private section
rg "private:|std::vector.*values" src/libexpr/primops/wasm.cc -B 2 -A 5

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Check the entire file structure - look for class definition
ast-grep --pattern 'struct $$$' src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 16875


🏁 Script executed:

# Just look at the full file to understand its structure
wc -l src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 101


🏁 Script executed:

# Read the beginning of the file to see the full class definition
head -150 src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 3848


🏁 Script executed:

# Find ValueVector definition
rg "using ValueVector|typedef.*ValueVector" src/libexpr/

Repository: DeterminateSystems/nix-src

Length of output: 333


🏁 Script executed:

# Also check the includes to understand the type better
head -10 src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 213


Add bounds checking for WASM-supplied ValueId values.

WASM functions can pass arbitrary ValueId values to methods like get_type(), get_int(), etc. The current code uses values.at(valueId) which throws std::out_of_range (a non-Error exception) on invalid IDs. This is inconsistent with the codebase's uniform use of Error for exception handling. Add a bounds-checking helper that throws Error, and apply it throughout (at least 11 usages across get_type, get_int, get_float, copy_string, get_bool, make_list, copy_list, make_attrset, copy_attrset, copy_attrname, and call_function).

🔧 Proposed helper + example usage
+    Value * requireValue(ValueId valueId)
+    {
+        if ((size_t) valueId >= values.size())
+            throw Error("invalid ValueId %u (size=%zu)", valueId, values.size());
+        return values[valueId];
+    }

Example replacement (apply throughout):

-        auto & value = *values.at(valueId);
+        auto & value = *requireValue(valueId);
🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 182 - 186, Add a bounds-check
helper (e.g., validateValueId or checkValueId) that checks the supplied ValueId
against values.size() and throws the codebase Error type on out-of-range IDs,
then replace direct uses of values.at(valueId) with a call to that helper
followed by safe indexed access (values[valueId]) in all WASM-exposed accessors:
get_type, get_int, get_float, copy_string, get_bool, make_list, copy_list,
make_attrset, copy_attrset, copy_attrname, and call_function so WASM-supplied
IDs produce a uniform Error exception instead of std::out_of_range.

Comment on lines +369 to +379
template<typename R, typename... Args>
static void regFun(Linker & linker, std::string_view name, R (NixWasmInstance::*f)(Args...))
{
unwrap(linker.func_wrap("env", name, [f](Caller caller, Args... args) -> Result<R, Trap> {
try {
auto instance = std::any_cast<NixWasmInstance *>(caller.context().get_data());
return (*instance.*f)(args...);
} catch (Error & e) {
return Trap(e.what());
}
}));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && git ls-files | head -20

Repository: DeterminateSystems/nix-src

Length of output: 138


🏁 Script executed:

cat -n src/libexpr/primops/wasm.cc | sed -n '360,390p'

Repository: DeterminateSystems/nix-src

Length of output: 1329


🏁 Script executed:

cat -n src/libexpr/primops/wasm.cc | sed -n '200,368p'

Repository: DeterminateSystems/nix-src

Length of output: 6079


Catch non-Error exceptions in host callbacks.

std::any_cast and values.at() can throw std::exception subclasses (e.g., std::bad_any_cast, std::out_of_range), which would escape uncaught and potentially terminate the host. Add handlers for std::exception and unknown exceptions.

🔧 Proposed fix
     unwrap(linker.func_wrap("env", name, [f](Caller caller, Args... args) -> Result<R, Trap> {
         try {
             auto instance = std::any_cast<NixWasmInstance *>(caller.context().get_data());
             return (*instance.*f)(args...);
         } catch (Error & e) {
             return Trap(e.what());
+        } catch (const std::exception & e) {
+            return Trap(e.what());
+        } catch (...) {
+            return Trap("unknown host exception");
         }
     }));
 }
🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 369 - 379, The regFun host callback
only catches Error exceptions, so std::any_cast, container access, or other
operations can throw std::exception-derived or unknown exceptions and escape;
update the lambda in regFun (the callback created in func_wrap) to add
catch(const std::exception& e) that returns Trap(e.what()) and a final
catch(...) that returns a generic Trap with an explanatory message (e.g.,
"unknown exception in host callback") so all exceptions from the Caller/context
access and method invocation are converted to Traps.

@github-actions github-actions bot temporarily deployed to pull request January 19, 2026 17:11 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 19, 2026 18:46 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 19, 2026 19:52 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 447-449: The code directly indexes the WASM return with
instance.runFunction(...).at(0).i32() which can throw if there are no returns or
the type isn't i32; update the block around instance.runFunction,
instance.values, addValue and state.forceValue to first capture the returned
vector, verify its size is >= 1, then inspect the first Value's type (the API's
type/kind accessor) to ensure it's an i32 before calling .i32(); if either check
fails, convert the condition into a clear Error (throwing an Error with a
descriptive message) so the existing catch(Error &e) handles it rather than
letting std::out_of_range or wasmtime exceptions escape.
♻️ Duplicate comments (10)
src/libexpr/primops/wasm.cc (10)

48-53: Replace assert with a runtime bounds check in subspan.

assert is compiled out in release builds, so this can silently allow OOB access.

🔧 Proposed fix
 template<typename T>
 static std::span<T> subspan(std::span<uint8_t> s, size_t len)
 {
-    assert(s.size() >= len * sizeof(T));
+    if (s.size() < len * sizeof(T))
+        throw Error("buffer too small: expected at least %zu bytes, got %zu",
+            len * sizeof(T), s.size());
     return std::span((T *) s.data(), len);
 }

120-127: Validate the memory export before std::get.

std::get<Memory> will throw if the export is missing or wrong type; translate that into a proper Error.

🔧 Proposed fix
-    , memory_(std::get<Memory>(*instance.get(wasmCtx, "memory")))
+    , memory_(([&]() -> Memory {
+        auto ext = instance.get(wasmCtx, "memory");
+        if (!ext)
+            throw Error("WASM module '%s' does not export memory", pre->wasmPath);
+        auto mem = std::get_if<Memory>(&*ext);
+        if (!mem)
+            throw Error("export 'memory' of WASM module '%s' is not a memory", pre->wasmPath);
+        return *mem;
+    })())

167-179: Add bounds-checked memory access for ptr/len.

WASM provides untrusted pointers; direct memory().subspan(ptr, len) can go OOB in release builds.

🔧 Proposed fix (helper + usage)
     auto memory()
     {
         return memory_.data(wasmCtx);
     }
+
+    std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len)
+    {
+        auto mem = memory();
+        uint64_t end = (uint64_t) ptr + (uint64_t) len;
+        if (end > mem.size())
+            throw Error("WASM memory out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size());
+        return mem.subspan(ptr, len);
+    }

Then replace every memory().subspan(ptr, len) with checkedSubspan(ptr, len) (and the subspan<T> wrappers on top of it).

Also applies to: 223-238, 255-258, 269-276, 284-303, 309-323, 334-348, 352-359, 368-374


182-247: Validate ValueId bounds and throw Error.

values.at() throws std::out_of_range, which bypasses the project’s Error handling strategy.

🔧 Proposed fix
+    Value * requireValue(ValueId valueId)
+    {
+        if ((size_t) valueId >= values.size())
+            throw Error("invalid ValueId %u (size=%zu)", valueId, values.size());
+        return values[valueId];
+    }

Apply:

-    auto & value = *values.at(valueId);
+    auto & value = *requireValue(valueId);

Also applies to: 255-266, 269-279, 284-304, 309-312, 334-338, 352-359, 373-379


255-264: Handle little‑endian conversion when reading IDs from WASM memory.

WASM is little‑endian; reading ValueId/struct fields directly is incorrect on big‑endian hosts.

Also applies to: 269-279, 284-304, 309-328


324-328: Sort attributes to ensure deterministic order.

The FIXME notes non‑determinism; attribute ordering should be stable across evaluations.


334-345: Replace assert with runtime checks in copy_attrname.

Release builds drop the asserts, which can cause OOB or buffer mismatches.

🔧 Proposed fix
-    assert((size_t) attrIdx < attrs.size());
+    if ((size_t) attrIdx >= attrs.size())
+        throw Error("attribute index %u out of bounds (size: %zu)", attrIdx, attrs.size());

     std::string_view name = state.symbols[attrs[attrIdx].name];

-    assert((size_t) len == name.size());
+    if ((size_t) len != name.size())
+        throw Error("buffer size mismatch: expected %zu bytes, got %u", name.size(), len);

389-398: Trap non‑Error exceptions in host callbacks.

std::any_cast and container access can throw std::exception subclasses. Convert them into Traps.

🔧 Proposed fix
     unwrap(linker.func_wrap("env", name, [f](Caller caller, Args... args) -> Result<R, Trap> {
         try {
             auto instance = std::any_cast<NixWasmInstance *>(caller.context().get_data());
             return (*instance.*f)(args...);
         } catch (Error & e) {
             return Trap(e.what());
+        } catch (const std::exception & e) {
+            return Trap(e.what());
+        } catch (...) {
+            return Trap("unknown host exception");
         }
     }));

431-438: Make the instancesPre cache thread‑safe (and bounded).

Static unordered_map is unsafe under parallel evaluation and grows unbounded.


457-463: Rename primop_fromTOML to match its purpose.

The identifier is misleading for a WASM primop.

🔧 Proposed fix
-static RegisterPrimOp primop_fromTOML(
+static RegisterPrimOp primop_wasm(

Comment on lines +447 to +449
auto vRes =
instance.values.at(instance.runFunction(functionName, {(int32_t) instance.addValue(args[2])}).at(0).i32());
state.forceValue(*vRes, pos);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure and locate the problematic code
find . -name "wasm.cc" -type f

Repository: DeterminateSystems/nix-src

Length of output: 99


🏁 Script executed:

# Read the specific lines in question with context
head -n 460 src/libexpr/primops/wasm.cc | tail -n 30

Repository: DeterminateSystems/nix-src

Length of output: 1188


🏁 Script executed:

# Get broader context around these lines to understand the function
sed -n '420,470p' src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 1832


🏁 Script executed:

# Search for the definition of runFunction to understand what it returns
rg "runFunction" src/libexpr/primops/wasm.cc -A 5 -B 2

Repository: DeterminateSystems/nix-src

Length of output: 737


🏁 Script executed:

# Check for any validation of WASM return values in the file
rg "at\(0\)" src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 189


🏁 Script executed:

# Search for similar patterns where WASM returns are handled
rg "\.i32\(\)" src/libexpr/primops/wasm.cc -B 3 -A 1

Repository: DeterminateSystems/nix-src

Length of output: 302


🏁 Script executed:

# Search for Val type definition
rg "struct Val|class Val" src/libexpr/primops/wasm.cc -A 10

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Search for i32() method definition
rg "\.i32\(" src/libexpr/primops/wasm.cc -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 267


🏁 Script executed:

# Look for the Val class or struct definition more broadly
rg "i32\(\)" src/ --type cc -B 5 -A 5 | head -50

Repository: DeterminateSystems/nix-src

Length of output: 96


🏁 Script executed:

# Check what unwrap does and if it validates
rg "unwrap" src/libexpr/primops/wasm.cc -B 2 -A 3

Repository: DeterminateSystems/nix-src

Length of output: 1090


🏁 Script executed:

# Look for Val definition in header files
find . -name "*.h" -o -name "*.hh" | xargs grep -l "struct Val\|class Val" 2>/dev/null | head -5

Repository: DeterminateSystems/nix-src

Length of output: 294


🏁 Script executed:

# Look at the includes in wasm.cc
head -50 src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 1166


🏁 Script executed:

# Search for Val in all files
rg "struct Val\|class Val" . --type-list | head -20

Repository: DeterminateSystems/nix-src

Length of output: 459


🏁 Script executed:

# Try searching for wasmtime Val
rg "i32()" . -B 3 -A 3 | head -50

Repository: DeterminateSystems/nix-src

Length of output: 557


🏁 Script executed:

# Look for imports or using statements related to wasmtime
rg "using.*Val|#include.*wasmtime" src/libexpr/primops/wasm.cc -A 2 -B 2

Repository: DeterminateSystems/nix-src

Length of output: 308


🏁 Script executed:

# Check if this is wasmtime-cpp library
rg "wasmtime\|wasm\\.h\|Val" src/libexpr/primops/wasm.cc | head -20

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Search for other uses of .i32() in the repository to understand patterns
rg "\.i32()" . -B 3 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 501


🏁 Script executed:

# Check if there's any validation before accessing vector results
rg "runFunction" . -B 2 -A 5 | head -60

Repository: DeterminateSystems/nix-src

Length of output: 1307


🏁 Script executed:

# Look for test files or examples that might show expected usage
find . -name "*wasm*test*" -o -name "*test*wasm*"

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Check if there's any documentation or comments about return validation
rg "WASM|wasm" src/libexpr/primops/wasm.cc | grep -i "return\|result\|valid"

Repository: DeterminateSystems/nix-src

Length of output: 639


🏁 Script executed:

# Search for any wrapper or validation around Val.i32() in the codebase
rg "Val.*i32|i32.*Val" . -B 3 -A 3

Repository: DeterminateSystems/nix-src

Length of output: 557


🏁 Script executed:

# Look for the actual code handling values.at() to see if there's pattern matching
rg "values\.at" src/libexpr/primops/wasm.cc -B 5 -A 5

Repository: DeterminateSystems/nix-src

Length of output: 4517


🏁 Script executed:

# Check what the .at() method on std::vector throws
# and whether the catch block would catch it
rg "catch.*std\|catch.*exception" src/libexpr/primops/wasm.cc -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

# Get the full context of the try-catch block to understand scope
sed -n '400,460p' src/libexpr/primops/wasm.cc

Repository: DeterminateSystems/nix-src

Length of output: 2698


🏁 Script executed:

# Verify if Val.i32() can throw exceptions by looking at wasmtime documentation or patterns
# Check if there are any type checks before calling i32()
rg "i32()|type()" src/libexpr/primops/wasm.cc -B 2 -A 2

Repository: DeterminateSystems/nix-src

Length of output: 1299


🏁 Script executed:

# Look for any validation pattern before accessing results
rg "empty()|size()" src/libexpr/primops/wasm.cc -B 3 -A 1

Repository: DeterminateSystems/nix-src

Length of output: 1933


Add validation before accessing WASM function return value type and arity.

The code chains .at(0).i32() without validating that the function returns at least one value or that it's an i32 type. If either check fails, std::out_of_range or a wasmtime exception will be thrown—neither caught by the existing catch (Error & e) block. Check the return vector size and validate the type explicitly before accessing the result, converting any failures to a clear Error.

🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 447 - 449, The code directly
indexes the WASM return with instance.runFunction(...).at(0).i32() which can
throw if there are no returns or the type isn't i32; update the block around
instance.runFunction, instance.values, addValue and state.forceValue to first
capture the returned vector, verify its size is >= 1, then inspect the first
Value's type (the API's type/kind accessor) to ensure it's an i32 before calling
.i32(); if either check fails, convert the condition into a clear Error
(throwing an Error with a descriptive message) so the existing catch(Error &e)
handles it rather than letting std::out_of_range or wasmtime exceptions escape.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 420-437: The read_file function can overflow the WASM memory check
and truncate large files: ensure you validate bounds using size_t arithmetic
before copying by checking that contents.size() <= len AND that ptr +
contents.size() does not wrap and is <= memory().size(); perform the
pointer+length check with a non-wrapping check (e.g., ptr <= memory().size() &&
contents.size() <= memory().size() - ptr) and use a safe min to copy only up to
the provided len, and handle/return sizes > UINT32_MAX explicitly (either clamp
to UINT32_MAX or return an error) so the uint32_t return and memcpy cannot
silently overflow or truncate. Use the function name read_file, the
pathValue/path.readFile() result (contents), and memory() when locating the
changes.
♻️ Duplicate comments (2)
src/libexpr/primops/wasm.cc (2)

496-504: Verify WASM function return value handling.

The chained access instance.runFunction(...).at(0).i32() at line 502 assumes the WASM function returns exactly one i32 value. This was flagged in past reviews. Additionally, the FIXME at line 498 about using the "start" function represents incomplete functionality.

Consider adding explicit validation:

#!/bin/bash
# Check if there's any documentation or tests showing expected WASM function signatures
rg -n "nix_wasm_init_v1|runFunction" --type=cpp -C 3

511-518: Rename misleading variable name.

The variable primop_fromTOML should be primop_wasm to match the registered primop name. This was flagged in past reviews.

🧹 Nitpick comments (2)
src/libexpr/primops/wasm.cc (2)

94-104: Consider refactoring statement expression for portability.

The constructor uses a GNU C++ statement expression ({...}) (lines 97-101) in the member initializer list, which is a non-standard extension. While GCC and Clang support it, this reduces portability.

♻️ Refactor to use a static factory or helper method
+private:
+    static InstancePre createInstancePre(const SourcePath& wasmPath, Engine& engine)
+    {
+        Linker linker(engine);
+        regFuns(linker);
+        return unwrap(instantiate_pre(linker, unwrap(Module::compile(engine, string2span(wasmPath.readFile())))));
+    }
+
+public:
     NixWasmInstancePre(SourcePath _wasmPath)
         : engine(getEngine())
         , wasmPath(_wasmPath)
-        , instancePre(({
-            Linker linker(engine);
-            regFuns(linker);
-            unwrap(instantiate_pre(linker, unwrap(Module::compile(engine, string2span(wasmPath.readFile())))));
-        }))
+        , instancePre(createInstancePre(_wasmPath, engine))
     {
     }

182-197: Type mismatch in error-throwing lambda.

The lambda at line 196 returns int but is used in a context expecting uint32_t. While this works due to implicit conversion, it's clearer to match the expected type.

-                               : []() -> int { throw Error("unsupported type"); }();
+                               : []() -> uint32_t { throw Error("unsupported type"); }();

Comment on lines +420 to +437
uint32_t read_file(ValueId pathId, uint32_t ptr, uint32_t len)
{
auto & pathValue = *values.at(pathId);
state.forceValue(pathValue, noPos);
if (pathValue.type() != nPath)
throw Error("read_file expects a path value");

auto path = pathValue.path();
auto contents = path.readFile();

// FIXME: this is an inefficient interface since it may cause the file to be read twice.
if (contents.size() <= len) {
auto buf = memory().subspan(ptr, len);
memcpy(buf.data(), contents.data(), contents.size());
}

return contents.size();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Consider potential integer overflow in buffer size comparison.

When len is close to UINT32_MAX, the comparison contents.size() <= len could pass even if the actual copy would overflow the WASM linear memory. The memory bounds checking issue flagged in past reviews applies here as well.

Additionally, returning the file size as uint32_t could silently truncate files larger than 4GB.

🔧 Suggested bounds validation
     uint32_t read_file(ValueId pathId, uint32_t ptr, uint32_t len)
     {
         auto & pathValue = *values.at(pathId);
         state.forceValue(pathValue, noPos);
         if (pathValue.type() != nPath)
             throw Error("read_file expects a path value");

         auto path = pathValue.path();
         auto contents = path.readFile();

+        if (contents.size() > std::numeric_limits<uint32_t>::max())
+            throw Error("file too large for WASM (size: %zu)", contents.size());
+
         // FIXME: this is an inefficient interface since it may cause the file to be read twice.
         if (contents.size() <= len) {
             auto buf = memory().subspan(ptr, len);
             memcpy(buf.data(), contents.data(), contents.size());
         }

         return contents.size();
     }
🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 420 - 437, The read_file function
can overflow the WASM memory check and truncate large files: ensure you validate
bounds using size_t arithmetic before copying by checking that contents.size()
<= len AND that ptr + contents.size() does not wrap and is <= memory().size();
perform the pointer+length check with a non-wrapping check (e.g., ptr <=
memory().size() && contents.size() <= memory().size() - ptr) and use a safe min
to copy only up to the provided len, and handle/return sizes > UINT32_MAX
explicitly (either clamp to UINT32_MAX or return an error) so the uint32_t
return and memcpy cannot silently overflow or truncate. Use the function name
read_file, the pathValue/path.readFile() result (contents), and memory() when
locating the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants