-
Notifications
You must be signed in to change notification settings - Fork 6
WASM support #309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
WASM support #309
Conversation
It adds ~400 MB to the Nix closure which is a lot.
Also, require WASM plugins to export a `nix_wasm_init_v1` function.
We don't want to reuse instances between WASM calls, since that's impure: the WASM function could use some mutable global state to return a different value every time. So instead, pre-instantiate instances, allowing them to be instantiated efficiently. Unfortunately, while this is much faster than not using pre-instantiation, it is slower than reusing an instance: about 1.3s compared to 0.7s for the call.nix test in nix-wasm-plugin-test (which performs 100K WASM calls).
📝 WalkthroughWalkthroughAdds Wasmtime packaging and linking, implements a Wasmtime-backed wasm primop and a WASI derivation builder for wasm32-wasip1, updates build inputs/linker flags and path-realization API, and introduces experimental flags for wasm builtins and wasm-targeted derivations. Changes
Sequence Diagram(s)sequenceDiagram
participant User as Nix User
participant Eval as EvalState
participant Cache as InstanceCache
participant Engine as Wasmtime Engine/Linker
participant Module as WASM Module
participant Instance as WASM Instance
participant Memory as WASM Memory
participant Host as Host Functions
User->>Eval: prim_wasm(path, entry, arg)
Eval->>Cache: lookup(path)
alt cached
Cache-->>Eval: pre-instantiated data
else not cached
Eval->>Engine: create Engine & Linker
Engine->>Host: register host functions
Eval->>Module: load module from path
Engine->>Module: compile & instantiate with Linker
Module->>Instance: create instance & memory
Cache-->>Eval: store pre-instantiation
end
Eval->>Instance: locate entry function
Eval->>Memory: write arg into memory
Eval->>Instance: call entry(ptr)
Instance->>Host: call host functions as needed
Instance-->>Eval: return result ptr
Eval->>Memory: read result
Eval-->>User: return Nix Value
sequenceDiagram
participant Store as Nix Store
participant Dispatcher as Builder Dispatcher
participant WasiBuilder as WASI Derivation Builder
participant Wasmtime as Wasmtime Engine
participant WasmModule as WASM Module
participant Instance as WASM Instance
participant WasiEnv as WASI Environment
Store->>Dispatcher: makeDerivationBuilder(drv: wasm32-wasip1)
Dispatcher->>WasiBuilder: instantiate WASI builder
WasiBuilder->>Wasmtime: new Engine & Linker
WasiBuilder->>WasiEnv: configure WASI (stdio, argv, env, preopens)
WasiBuilder->>WasmModule: load builder wasm path
Wasmtime->>Instance: instantiate module
WasiBuilder->>Instance: locate _start export
WasiBuilder->>Instance: invoke _start()
Instance->>WasiEnv: perform WASI syscalls
Instance-->>WasiBuilder: execution complete
WasiBuilder-->>Store: build finished
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Fix all issues with AI Agents 🤖
In @src/libexpr/primops/wasm.cc:
- Around line 255-267: The code reads ValueId uint32_t values directly from WASM
memory in make_list, copy_list, make_attrset, and copy_attrset, assuming host
endianness; fix this by converting each 32-bit word from little-endian to host
order when constructing ValueId (e.g., apply le32toh or an equivalent portable
little-endian-to-host conversion) for every read from the memory subspan (the
vs[] accesses) so ValueId values are correct on big-endian hosts; update all
occurrences in the listed functions to perform this conversion before indexing
into values or assigning into lists/attrsets.
- Around line 334-350: In copy_attrname, replace the compile-time asserts with
runtime validations: check that attrIdx < attrs.size() and that len ==
name.size(); if either check fails, raise a runtime error (e.g., throw
std::runtime_error with a clear message mentioning valueId, attrIdx, len and
name.size()) or otherwise signal a trap/error path instead of proceeding; only
perform memcpy when both checks pass and keep the rest of the logic
(state.forceAttrs, obtaining name from state.symbols, and writing into memory())
unchanged.
- Around line 433-439: The static variable name `primop_fromTOML` is misleading
for the WASM primop registration; rename the variable to a descriptive
identifier like `primop_wasm` (i.e., change `static RegisterPrimOp
primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`) so it matches
the registered `.name = "wasm"` and `.fun = prim_wasm`; update any local
references to this static variable if present in the file to use the new name.
- Around line 309-332: The copy_attrset function writes attributes into WASM
memory using enumerate(*value.attrs()) which is non-deterministic; to fix,
gather the attributes into a temporary vector (e.g., vector of pairs or pointers
to attr), sort that vector lexicographically by the attribute name string via
state.symbols[attr.name], then iterate the sorted vector to populate
buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().
- Around line 411-418: The static unordered_map instancesPre is not thread-safe
and leaks entries; replace it with a thread-safe concurrent map (use
boost::concurrent_flat_map keyed by SourcePath) and store weak references
instead of owning refs so entries can be reclaimed (e.g., map value -> weak
pointer to NixWasmInstancePre or equivalent weak_ref). Update the lookup/insert
logic around instancesPre and the use of make_ref<NixWasmInstancePre>(wasmPath)
so you first attempt to lock the weak entry, create a new instance only if
expired, and store a weak reference in the concurrent_flat_map; optionally add
an eviction or LRU policy if unbounded growth remains a concern.
- Around line 49-53: The subspan function currently uses assert to check buffer
size which is omitted in release builds; replace the assert in
subspan(std::span<uint8_t> s, size_t len) with a runtime check that throws a
descriptive exception (e.g., std::out_of_range or std::length_error) when
s.size() < len * sizeof(T), include the required header ( <stdexcept> ), and
keep the returned span construction (use reinterpret_cast<T*>(s.data()) for
clarity) so callers get a safe failure instead of potential buffer overruns.
🧹 Nitpick comments (12)
src/libstore/derivation-options.cc (1)
399-400: Consider requiring a system feature for WASM support.The special-casing of "wasm32-wasip1" allows any system to build WASM derivations, which aligns with the PR's goal of system-independent derivations. However, this assumes Wasmtime is always available.
Consider whether wasm32-wasip1 builds should require a system feature (e.g., "wasm" or "wasmtime") to ensure the runtime is actually available on the build machine. This would be checked by the existing
getRequiredSystemFeatureslogic at lines 406-408.🔎 Alternative approach using system features
If you want to gate WASM builds on explicit system capability:
- Have wasm32-wasip1 derivations declare
requiredSystemFeatures = ["wasm"];- Remove the special case here, letting the normal system feature check at lines 406-408 handle it:
- if (drv.platform != settings.thisSystem.get() && drv.platform != "wasm32-wasip1" - && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin()) + if (drv.platform != settings.thisSystem.get() + && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin()) return false;This would make WASM support more explicit and consistent with how other cross-compilation scenarios are handled.
src/libstore/meson.build (1)
383-383: Consider using Meson's dependency system instead of hardcoded linker flags.Directly appending
-lwasmtimetolink_argsbypasses Meson's dependency resolution, which can lead to portability and configuration issues:
- No verification that the wasmtime library is actually available
- Library search paths may not be configured correctly
- Platform-specific library naming conventions (e.g.,
.dllvs.so) are not handled- Build will fail with cryptic linker errors rather than clear dependency errors
🔎 Recommended approach using Meson dependency system
Near the top of the file with other dependencies (around line 109-161), add:
+wasmtime = dependency('wasmtime', required : true) +deps_private += wasmtimeThen remove the hardcoded linker flag:
- link_args : linker_export_flags + [ '-lwasmtime' ], + link_args : linker_export_flags,This provides better error messages, respects pkg-config paths, and handles platform differences automatically.
src/libexpr/meson.build (1)
242-242: Consider using Meson's dependency system instead of hardcoded linker flags.This has the same portability concerns as the identical change in
src/libstore/meson.buildat line 383. Hardcoded-lwasmtimebypasses Meson's dependency resolution and can cause issues with library paths, platform-specific naming, and error reporting.🔎 Recommended approach using Meson dependency system
Near the dependencies section (around lines 41-92), add:
+wasmtime = dependency('wasmtime', required : true) +deps_private += wasmtimeThen remove the hardcoded linker flag:
- link_args : linker_export_flags + [ '-lwasmtime' ], + link_args : linker_export_flags,This approach provides better error messages, respects pkg-config configurations, and handles cross-platform differences automatically.
src/libstore/unix/build/derivation-builder.cc (1)
2033-2035: Early return forwasm32-wasip1bypasses later sandbox / uid‑range checksRouting
params.drv.platform == "wasm32-wasip1"directly toWasiDerivationBuilderhappens before:
- The diverted store check (
storeDir != realStoreDir) and- The
!useSandbox && params.drvOptions.useUidRange(drv)guard.That means for WASI builds:
uid-rangerequests will no longer be rejected when sandboxing is effectively disabled here.- Diverted stores on non‑Linux platforms will also be allowed for this path, unlike other builders.
If this is intentional (relying on the WASI runtime rather than OS sandboxing), consider making that explicit and either:
- Reject
uid-rangeinWasiDerivationBuilder’s path (or in this branch) so configuration errors are surfaced, or- Document that
uid-rangeis ignored forwasm32-wasip1and that OS sandboxing semantics differ from other systems.src/libexpr/primops.cc (1)
168-188: MakingrealisePathnon‑static for reuse is reasonableExposing
SourcePath realisePath(...)(no longerstatic) with the optionalSymlinkResolutionparameter allows other primop modules (e.g. WASM/WASI ones) to share the same path + context resolution logic. The implementation is consistent with existing callers, and the default (Full) preserves previous behavior.Just ensure there is a matching declaration in a header (e.g.
primops.hh) so external uses stay type‑checked and the function remains the single canonical implementation.src/libstore/unix/build/wasi-derivation-builder.cc (1)
5-18: Deduplicateunwrap/string2spanhelpers and tighten const‑correctnessBoth
unwrapandstring2spanappear to be cut‑and‑paste fromsrc/libexpr/primops/wasm.cc, which means future changes to the Wasmtime API or error handling will have to be done in two places.Consider:
- Moving these helpers to a small shared header (e.g. a
wasm-runtime-utils.hh) used by both primops and the WASI builder, and- Switching
string2spanto use astd::span<const uint8_t>(and avoiding the C‑style cast) if the Wasmtime C++ API accepts a const span, to keep const‑correctness and avoid accidental mutation.Please double‑check the current
wasmtime::Module::compilesignature in your pinned Wasmtime version to confirm whetherstd::span<const uint8_t>is acceptable and adjust the helper accordingly.src/libexpr/primops/wasm.cc (6)
6-6: Consider removingusing namespace wasmtime;for clarity.While acceptable in implementation files, explicitly qualifying wasmtime types improves code clarity and prevents potential name collisions with future additions to the nix namespace.
10-15: Refactor: Remove duplicate declaration and include proper header.The
realisePathfunction is already defined insrc/libexpr/primops.cc(lines 167-187). Instead of forward-declaring it here, consider extracting it to a shared header file that both files can include, or making it part of theEvalStateAPI.
94-103: Consider factory method for clearer error handling.The constructor performs heavy operations (WASM compilation via
unwrap()) that can throw. Consider using a static factory method that returnsResult<ref<NixWasmInstancePre>, Error>for clearer error handling and easier testing.
182-196: Simplify error throwing pattern.Line 196 uses an immediately-invoked lambda to throw an error, which is less clear than throwing directly. The ternary chain could also benefit from a switch statement for clarity.
🔎 Proposed refactor
uint32_t get_type(ValueId valueId) { auto & value = *values.at(valueId); state.forceValue(value, noPos); - auto t = value.type(); - return t == nInt ? 1 - : t == nFloat ? 2 - : t == nBool ? 3 - : t == nString ? 4 - : t == nPath ? 5 - : t == nNull ? 6 - : t == nAttrs ? 7 - : t == nList ? 8 - : t == nFunction ? 9 - : []() -> int { throw Error("unsupported type"); }(); + switch (value.type()) { + case nInt: return 1; + case nFloat: return 2; + case nBool: return 3; + case nString: return 4; + case nPath: return 5; + case nNull: return 6; + case nAttrs: return 7; + case nList: return 8; + case nFunction: return 9; + default: throw Error("unsupported type"); + } }
230-238: Consider adding documentation for two-phase copy pattern.This function uses a common but potentially confusing pattern: it only copies data if the buffer is large enough, but always returns the actual size. A brief comment explaining that callers should check
returnValue <= maxLenbefore using the buffer would improve clarity.
426-426: Improve readability by splitting complex expression.Line 426 chains multiple operations in a single expression, making it difficult to read and debug. Consider breaking it into separate steps.
🔎 Proposed refactor
- v = *instance.values.at(instance.runFunction(functionName, {(int32_t) instance.addValue(args[2])}).at(0).i32()); + auto argId = instance.addValue(args[2]); + auto results = instance.runFunction(functionName, {(int32_t) argId}); + auto resultId = results.at(0).i32(); + v = *instance.values.at(resultId);
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
packaging/dependencies.nixpackaging/wasmtime.nixsrc/libexpr/meson.buildsrc/libexpr/package.nixsrc/libexpr/primops.ccsrc/libexpr/primops/meson.buildsrc/libexpr/primops/wasm.ccsrc/libstore/derivation-options.ccsrc/libstore/meson.buildsrc/libstore/unix/build/derivation-builder.ccsrc/libstore/unix/build/wasi-derivation-builder.cc
🧰 Additional context used
🧬 Code graph analysis (3)
src/libstore/derivation-options.cc (1)
src/libstore/include/nix/store/derivation-options.hh (2)
drv(184-184)drv(201-201)
src/libstore/unix/build/wasi-derivation-builder.cc (1)
src/libexpr/primops/wasm.cc (7)
unwrap(20-25)unwrap(20-20)v(131-136)v(131-131)v(138-143)string2span(38-41)string2span(38-38)
src/libexpr/primops/wasm.cc (1)
src/libexpr/primops.cc (4)
state(3218-3224)state(3218-3218)realisePath(168-188)realisePath(168-172)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (6)
src/libexpr/primops/meson.build (1)
12-12: LGTM!Adding wasm.cc to the primops sources is straightforward and correct.
src/libexpr/package.nix (1)
17-17: LGTM!The wasmtime dependency is correctly added following the same pattern as toml11 and other build-time dependencies.
Also applies to: 68-68
packaging/dependencies.nix (1)
78-79: Wiringwasmtimeinto the dependency scope looks correct
wasmtime = pkgs.callPackage ./wasmtime.nix { };is consistent with how other custom packages are exposed here; no further changes needed.src/libstore/unix/build/derivation-builder.cc (2)
294-298: Naming comment onrealPathInSandboxThe
// FIXME: misnomernote correctly reflects that this helper currently just resolves store paths rather than honoring a sandbox path mapping. No functional impact, so this can remain as a low‑priority cleanup marker.
1996-2001: Includingwasi-derivation-builder.ccalongside other platform builders is consistentAdding
#include "wasi-derivation-builder.cc"next to the other platform-specific builder includes matches the existing “single TU with included .cc files” pattern. No issues from an integration standpoint.packaging/wasmtime.nix (1)
1-121: Wasmtime packaging looks solid; minor note on header-copy path robustnessOverall this derivation is well-structured:
- Uses a pinned Rust toolchain (
rust_1_89) andbuildRustPackage.- Restricts
doCheckto platforms with the required SIMD features and skips problematicaarch64runs.- Splits outputs into
out,dev, andlib, pruning.so/.aaccording toenableShared/enableStatic.- Handles shell completions correctly for both native and cross builds.
- Adds an install-time
--versioncheck and an updateScript with an appropriate version regex.One thing to keep in mind is the header copy:
cp -r target/${cargoShortTarget}/release/build/wasmtime-c-api-impl-*/out/include $dev/includeThis relies on the exact crate/build directory naming and
releaseprofile path; if upstream layout or build profile changes, this will break somewhat opaquely. It may be worth:
- Asserting that exactly one
wasmtime-c-api-impl-*match exists, or- Failing with a clearer message if the glob expands to zero or multiple directories.
Otherwise the packaging is in good shape.
Please double-check against your pinned Wasmtime 40.0.0 sources that the
wasmtime-c-api-impl-*build output layout (and thecargoShortTarget/releasepath) is stable; if upstream changes it, thecp -r .../out/includestep will be the first thing to break.
| static std::span<T> subspan(std::span<uint8_t> s, size_t len) | ||
| { | ||
| assert(s.size() >= len * sizeof(T)); | ||
| return std::span((T *) s.data(), len); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Replace assert with proper error handling.
The assert on line 51 is compiled out in release builds, which could lead to buffer overruns if the size check fails. Use a runtime check that throws an exception instead.
🔎 Proposed fix
template<typename T>
static std::span<T> subspan(std::span<uint8_t> s, size_t len)
{
- assert(s.size() >= len * sizeof(T));
+ if (s.size() < len * sizeof(T))
+ throw Error("buffer too small: expected at least %d bytes, got %d", len * sizeof(T), s.size());
return std::span((T *) s.data(), len);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| static std::span<T> subspan(std::span<uint8_t> s, size_t len) | |
| { | |
| assert(s.size() >= len * sizeof(T)); | |
| return std::span((T *) s.data(), len); | |
| } | |
| template<typename T> | |
| static std::span<T> subspan(std::span<uint8_t> s, size_t len) | |
| { | |
| if (s.size() < len * sizeof(T)) | |
| throw Error("buffer too small: expected at least %d bytes, got %d", len * sizeof(T), s.size()); | |
| return std::span((T *) s.data(), len); | |
| } |
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 49-53, The subspan function
currently uses assert to check buffer size which is omitted in release builds;
replace the assert in subspan(std::span<uint8_t> s, size_t len) with a runtime
check that throws a descriptive exception (e.g., std::out_of_range or
std::length_error) when s.size() < len * sizeof(T), include the required header
( <stdexcept> ), and keep the returned span construction (use
reinterpret_cast<T*>(s.data()) for clarity) so callers get a safe failure
instead of potential buffer overruns.
| ValueId make_list(uint32_t ptr, uint32_t len) | ||
| { | ||
| auto vs = subspan<ValueId>(memory().subspan(ptr), len); | ||
|
|
||
| auto [valueId, value] = allocValue(); | ||
|
|
||
| auto list = state.buildList(len); | ||
| for (const auto & [n, v] : enumerate(list)) | ||
| v = values.at(vs[n]); // FIXME: endianness | ||
| value.mkList(list); | ||
|
|
||
| return valueId; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Address endianness assumption for cross-platform correctness.
The FIXME on line 263 notes that ValueIds are read from WASM memory without endianness conversion. WASM uses little-endian, so this code will fail on big-endian host systems. While big-endian systems are rare, this violates Nix's goal of deterministic, cross-platform builds.
The same issue appears in copy_list (line 278), make_attrset (line 296), and copy_attrset (line 322).
Consider using explicit little-endian conversion functions (e.g., le32toh) when reading multi-byte values from WASM memory.
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 255-267, The code reads ValueId
uint32_t values directly from WASM memory in make_list, copy_list, make_attrset,
and copy_attrset, assuming host endianness; fix this by converting each 32-bit
word from little-endian to host order when constructing ValueId (e.g., apply
le32toh or an equivalent portable little-endian-to-host conversion) for every
read from the memory subspan (the vs[] accesses) so ValueId values are correct
on big-endian hosts; update all occurrences in the listed functions to perform
this conversion before indexing into values or assigning into lists/attrsets.
| uint32_t copy_attrset(ValueId valueId, uint32_t ptr, uint32_t maxLen) | ||
| { | ||
| auto & value = *values.at(valueId); | ||
| state.forceAttrs(value, noPos, "while copying an attrset into WASM"); | ||
|
|
||
| if (value.attrs()->size() <= maxLen) { | ||
| // FIXME: endianness. | ||
| struct Attr | ||
| { | ||
| ValueId value; | ||
| uint32_t nameLen; | ||
| }; | ||
|
|
||
| auto buf = subspan<Attr>(memory().subspan(ptr), maxLen); | ||
|
|
||
| // FIXME: for determinism, we should return attributes in lexicographically sorted order. | ||
| for (const auto & [n, attr] : enumerate(*value.attrs())) { | ||
| buf[n].value = addValue(attr.value); | ||
| buf[n].nameLen = state.symbols[attr.name].size(); | ||
| } | ||
| } | ||
|
|
||
| return value.attrs()->size(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix: Ensure deterministic attribute ordering.
The FIXME on line 324 correctly identifies that attributes should be returned in lexicographically sorted order for determinism. Nix's evaluation model requires consistent attribute ordering across evaluations.
🔎 Proposed fix
if (value.attrs()->size() <= maxLen) {
// FIXME: endianness.
struct Attr
{
ValueId value;
uint32_t nameLen;
};
auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);
- // FIXME: for determinism, we should return attributes in lexicographically sorted order.
- for (const auto & [n, attr] : enumerate(*value.attrs())) {
+ // Collect and sort attributes by name for determinism
+ std::vector<std::pair<std::string_view, const Attr*>> sortedAttrs;
+ for (const auto & attr : *value.attrs()) {
+ sortedAttrs.push_back({state.symbols[attr.name], &attr});
+ }
+ std::sort(sortedAttrs.begin(), sortedAttrs.end(),
+ [](const auto & a, const auto & b) { return a.first < b.first; });
+
+ for (const auto & [n, pair] : enumerate(sortedAttrs)) {
+ const auto & attr = *pair.second;
buf[n].value = addValue(attr.value);
buf[n].nameLen = state.symbols[attr.name].size();
}
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 309-332, The copy_attrset function
writes attributes into WASM memory using enumerate(*value.attrs()) which is
non-deterministic; to fix, gather the attributes into a temporary vector (e.g.,
vector of pairs or pointers to attr), sort that vector lexicographically by the
attribute name string via state.symbols[attr.name], then iterate the sorted
vector to populate buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().
| std::monostate copy_attrname(ValueId valueId, uint32_t attrIdx, uint32_t ptr, uint32_t len) | ||
| { | ||
| auto & value = *values.at(valueId); | ||
| state.forceAttrs(value, noPos, "while copying an attr name into WASM"); | ||
|
|
||
| auto & attrs = *value.attrs(); | ||
|
|
||
| assert((size_t) attrIdx < attrs.size()); | ||
|
|
||
| std::string_view name = state.symbols[attrs[attrIdx].name]; | ||
|
|
||
| assert((size_t) len == name.size()); | ||
|
|
||
| memcpy(memory().subspan(ptr, len).data(), name.data(), name.size()); | ||
|
|
||
| return {}; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Replace assertions with runtime checks.
Lines 341 and 345 use assert statements that are compiled out in release builds. This can lead to:
- Line 341: Out-of-bounds array access if
attrIdx >= attrs.size() - Line 345: Buffer overflow if
len != name.size()
🔎 Proposed fix
std::monostate copy_attrname(ValueId valueId, uint32_t attrIdx, uint32_t ptr, uint32_t len)
{
auto & value = *values.at(valueId);
state.forceAttrs(value, noPos, "while copying an attr name into WASM");
auto & attrs = *value.attrs();
- assert((size_t) attrIdx < attrs.size());
+ if ((size_t) attrIdx >= attrs.size())
+ throw Error("attribute index %d out of bounds (size: %d)", attrIdx, attrs.size());
std::string_view name = state.symbols[attrs[attrIdx].name];
- assert((size_t) len == name.size());
+ if ((size_t) len != name.size())
+ throw Error("buffer size mismatch: expected %d bytes, got %d", name.size(), len);
memcpy(memory().subspan(ptr, len).data(), name.data(), name.size());
return {};
}🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 334-350, In copy_attrname, replace
the compile-time asserts with runtime validations: check that attrIdx <
attrs.size() and that len == name.size(); if either check fails, raise a runtime
error (e.g., throw std::runtime_error with a clear message mentioning valueId,
attrIdx, len and name.size()) or otherwise signal a trap/error path instead of
proceeding; only perform memcpy when both checks pass and keep the rest of the
logic (state.forceAttrs, obtaining name from state.symbols, and writing into
memory()) unchanged.
| // FIXME: make thread-safe. | ||
| // FIXME: make this a weak Boehm GC pointer so that it can be freed during GC. | ||
| static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre; | ||
|
|
||
| auto instancePre = instancesPre.find(wasmPath); | ||
| if (instancePre == instancesPre.end()) | ||
| instancePre = instancesPre.emplace(wasmPath, make_ref<NixWasmInstancePre>(wasmPath)).first; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if EvalState or prim_* functions are used in concurrent contexts
rg -n "std::thread|std::async|pthread" --type=cpp -C 2 | head -50
# Check for existing thread-safety patterns in primops
rg -n "std::mutex|std::lock_guard|std::atomic" src/libexpr/ --type=cpp -C 2 | head -50Repository: DeterminateSystems/nix-src
Length of output: 5273
🏁 Script executed:
#!/bin/bash
# Look at the actual file context around lines 411-418
cat -n src/libexpr/primops/wasm.cc | sed -n '400,430p'Repository: DeterminateSystems/nix-src
Length of output: 1640
🏁 Script executed:
#!/bin/bash
# Check how EvalState handles concurrent evaluation
rg -n "EvalState.*thread|thread.*EvalState" --type=cpp -C 3 | head -40
# Look for parallel evaluation support
rg -n "parallel|concurrent|thread-safe" src/libexpr/ --type=cpp | grep -i "eval\|prim" | head -20Repository: DeterminateSystems/nix-src
Length of output: 2211
Fix thread-unsafe static cache in prim_wasm.
The static instancesPre map on line 413 is not thread-safe and will race under parallel evaluation. Nix supports parallel evaluation (see builtins.parallel and parallel-eval.cc), making this a critical issue.
The cache grows unbounded and is never freed. Use boost::concurrent_flat_map (already used elsewhere in EvalState for srcToStore and importResolutionCache) and consider weak pointers or LRU eviction to prevent memory bloat.
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 411-418, The static unordered_map
instancesPre is not thread-safe and leaks entries; replace it with a thread-safe
concurrent map (use boost::concurrent_flat_map keyed by SourcePath) and store
weak references instead of owning refs so entries can be reclaimed (e.g., map
value -> weak pointer to NixWasmInstancePre or equivalent weak_ref). Update the
lookup/insert logic around instancesPre and the use of
make_ref<NixWasmInstancePre>(wasmPath) so you first attempt to lock the weak
entry, create a new instance only if expired, and store a weak reference in the
concurrent_flat_map; optionally add an eviction or LRU policy if unbounded
growth remains a concern.
src/libexpr/primops/wasm.cc
Outdated
| static RegisterPrimOp primop_fromTOML( | ||
| {.name = "wasm", | ||
| .args = {"wasm", "entry", "arg"}, | ||
| .doc = R"( | ||
| Call a WASM function with the specified argument. | ||
| )", | ||
| .fun = prim_wasm}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix misleading variable name.
The variable is named primop_fromTOML but actually registers the wasm primop. This appears to be a copy-paste error from another primop definition.
🔎 Proposed fix
-static RegisterPrimOp primop_fromTOML(
+static RegisterPrimOp primop_wasm(
{.name = "wasm",
.args = {"wasm", "entry", "arg"},
- .doc = R"(
- Call a WASM function with the specified argument.
- )",
+ .doc = R"(
+ Call a WASM function from a WebAssembly module.
+
+ Arguments:
+ - wasm: Path to the .wasm file
+ - entry: Name of the exported function to call
+ - arg: Argument to pass to the function (Nix value)
+
+ Returns the result from the WASM function as a Nix value.
+ The WASM module must export a `nix_wasm_init_v1` function.
+ )",
.fun = prim_wasm});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| static RegisterPrimOp primop_fromTOML( | |
| {.name = "wasm", | |
| .args = {"wasm", "entry", "arg"}, | |
| .doc = R"( | |
| Call a WASM function with the specified argument. | |
| )", | |
| .fun = prim_wasm}); | |
| static RegisterPrimOp primop_wasm( | |
| {.name = "wasm", | |
| .args = {"wasm", "entry", "arg"}, | |
| .doc = R"( | |
| Call a WASM function from a WebAssembly module. | |
| Arguments: | |
| - wasm: Path to the .wasm file | |
| - entry: Name of the exported function to call | |
| - arg: Argument to pass to the function (Nix value) | |
| Returns the result from the WASM function as a Nix value. | |
| The WASM module must export a `nix_wasm_init_v1` function. | |
| )", | |
| .fun = prim_wasm}); |
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 433-439, The static variable name
`primop_fromTOML` is misleading for the WASM primop registration; rename the
variable to a descriptive identifier like `primop_wasm` (i.e., change `static
RegisterPrimOp primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`)
so it matches the registered `.name = "wasm"` and `.fun = prim_wasm`; update any
local references to this static variable if present in the file to use the new
name.
| void execBuilder(const Strings & args, const Strings & envStrs) override | ||
| { | ||
| using namespace wasmtime; | ||
|
|
||
| Engine engine; | ||
| Linker linker(engine); | ||
| unwrap(linker.define_wasi()); | ||
|
|
||
| WasiConfig wasiConfig; | ||
| wasiConfig.inherit_stdin(); | ||
| wasiConfig.inherit_stdout(); | ||
| wasiConfig.inherit_stderr(); | ||
| wasiConfig.argv(std::vector(args.begin(), args.end())); | ||
| { | ||
| std::vector<std::pair<std::string, std::string>> env2; | ||
| for (auto & [k, v] : env) | ||
| env2.emplace_back(k, rewriteStrings(v, inputRewrites)); | ||
| wasiConfig.env(env2); | ||
| } | ||
| if (!wasiConfig.preopen_dir( | ||
| store.config->realStoreDir.get(), | ||
| store.storeDir, | ||
| WASMTIME_WASI_DIR_PERMS_READ | WASMTIME_WASI_DIR_PERMS_WRITE, | ||
| WASMTIME_WASI_FILE_PERMS_READ | WASMTIME_WASI_FILE_PERMS_WRITE)) | ||
| throw Error("cannot add store directory to WASI config"); | ||
| // FIXME: add temp dir | ||
|
|
||
| auto module = unwrap(Module::compile(engine, string2span(readFile(realPathInSandbox(drv.builder))))); | ||
| wasmtime::Store wasmStore(engine); | ||
| unwrap(wasmStore.context().set_wasi(std::move(wasiConfig))); | ||
| auto instance = unwrap(linker.instantiate(wasmStore, module)); | ||
|
|
||
| auto startName = "_start"; | ||
| auto ext = instance.get(wasmStore, startName); | ||
| if (!ext) | ||
| throw Error("WASM module '%s' does not export function '%s'", drv.builder, startName); | ||
| auto fun = std::get_if<Func>(&*ext); | ||
| if (!fun) | ||
| throw Error("export '%s' of WASM module '%s' is not a function", startName, drv.builder); | ||
|
|
||
| unwrap(fun->call(wasmStore.context(), {})); | ||
|
|
||
| _exit(0); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cd src/libstore/unix/build && wc -l wasi-derivation-builder.ccRepository: DeterminateSystems/nix-src
Length of output: 99
🏁 Script executed:
cd src/libstore/unix/build && head -80 wasi-derivation-builder.ccRepository: DeterminateSystems/nix-src
Length of output: 2592
🏁 Script executed:
rg -n "class.*DerivationBuilder" src/libstore/unix/build/ -A 5Repository: DeterminateSystems/nix-src
Length of output: 510
🏁 Script executed:
rg -n "initEnv" src/libstore/unix/build/ -A 10Repository: DeterminateSystems/nix-src
Length of output: 2704
🏁 Script executed:
rg -n "runChild" src/libstore/unix/build/build-user.cc -A 15Repository: DeterminateSystems/nix-src
Length of output: 147
🏁 Script executed:
rg -n "runChild" src/libstore/ -B 2 -A 15Repository: DeterminateSystems/nix-src
Length of output: 9282
🏁 Script executed:
sed -n '1057,1150p' src/libstore/unix/build/derivation-builder.ccRepository: DeterminateSystems/nix-src
Length of output: 4230
🏁 Script executed:
rg -n "void.*execBuilder" src/libstore/ -B 5 -A 5Repository: DeterminateSystems/nix-src
Length of output: 3761
🏁 Script executed:
sed -n '1276,1360p' src/libstore/unix/build/derivation-builder.ccRepository: DeterminateSystems/nix-src
Length of output: 2796
🏁 Script executed:
rg -n "tmpDirInSandbox" src/libstore/unix/build/ -A 3Repository: DeterminateSystems/nix-src
Length of output: 4675
🏁 Script executed:
sed -n '1340,1390p' src/libstore/unix/build/derivation-builder.ccRepository: DeterminateSystems/nix-src
Length of output: 2017
🏁 Script executed:
rg -n "for.*env\|envStrs" src/libstore/unix/build/derivation-builder.cc -A 3 -B 2 | grep -A 5 "1340\|1350\|1360\|1370"Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
sed -n '1350,1380p' src/libstore/unix/build/derivation-builder.ccRepository: DeterminateSystems/nix-src
Length of output: 1016
Address unused envStrs parameter and duplicate env rewriting
The envStrs parameter is computed in runChild() by iterating over env and applying rewriteStrings() to each key=value pair, then passed to execBuilder(). The WASI override ignores this and rebuilds env2 using identical logic. Reuse envStrs by parsing it to avoid duplication and ensure consistency if env handling changes.
Preopen the temp directory and adjust store permissions
initEnv() sets TMPDIR, NIX_BUILD_TOP, and PWD to tmpDirInSandbox(), but only the store directory is preopened in the WASI config. WASM builders attempting to write to $TMPDIR or access $NIX_BUILD_TOP will fail. Additionally, the store is preopened with READ|WRITE permissions on all files; typical chroot builds expose stores read-only, with only outputs writable. Add a preopen for the temp directory and consider making the store read-only while exposing outputs or scratch locations separately to match existing sandbox assumptions.
This function was named incorrectly. It denotes the path in the host filesystem, not in the sandbox.
Even though it's in a separate output, disabling it reduces the size of libwasmtime.so by around 12 MB.
This reduces the size of the Nix closure by another 12 MB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a copyright notice that points back to nixpkgs, since this is a verbatim copy of the package definition from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/libstore/package.nix (1)
64-69: Guardwasmtimeby platform compatibility.
wasmtimeis marked unix-only in its package metadata (platforms = lib.platforms.unix), but the library declares Windows support. Adding it unconditionally to buildInputs breaks evaluation on Windows. Use a platform guard likelib.optional stdenv.hostPlatform.isUnix.Proposed fix
buildInputs = [ boost curl sqlite - wasmtime ] + ++ lib.optional stdenv.hostPlatform.isUnix wasmtime ++ lib.optional stdenv.hostPlatform.isLinux libseccomp ++ lib.optional withAWS aws-crt-cpp;
🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 369-379: The regFun host callback only catches Error exceptions,
so std::any_cast, container access, or other operations can throw
std::exception-derived or unknown exceptions and escape; update the lambda in
regFun (the callback created in func_wrap) to add catch(const std::exception& e)
that returns Trap(e.what()) and a final catch(...) that returns a generic Trap
with an explanatory message (e.g., "unknown exception in host callback") so all
exceptions from the Caller/context access and method invocation are converted to
Traps.
- Around line 120-127: The constructor NixWasmInstance currently assumes
instance.get(wasmCtx, "memory") yields a Memory and directly uses std::get,
which can throw std::bad_variant_access; replace this with the same safe pattern
used in getFunction(): call instance.get(wasmCtx, "memory"), check the optional
export exists, then use std::get_if<Memory> on the contained Export to validate
the type, and if missing or wrong type throw a clear runtime_error (or use
unwrap) with a descriptive message; set memory_ only after successful
validation.
- Around line 182-186: Add a bounds-check helper (e.g., validateValueId or
checkValueId) that checks the supplied ValueId against values.size() and throws
the codebase Error type on out-of-range IDs, then replace direct uses of
values.at(valueId) with a call to that helper followed by safe indexed access
(values[valueId]) in all WASM-exposed accessors: get_type, get_int, get_float,
copy_string, get_bool, make_list, copy_list, make_attrset, copy_attrset,
copy_attrname, and call_function so WASM-supplied IDs produce a uniform Error
exception instead of std::out_of_range.
- Around line 162-165: The code uses memory().subspan(ptr, len) without runtime
bounds checks, which allows OOB access when WASM passes untrusted ptr/len; add a
checked helper (e.g., a function check_subspan(ptr, len) that fetches
memory_.data(wasmCtx), verifies ptr + len <= memory.size() and throws or returns
an error/empty optional on violation) and replace all direct
memory().subspan(...) usages in the WASM entry points (panic(), warn(),
make_string(), make_list(), make_attrset(), copy_attrname(), call_function() and
any other places referencing memory().subspan) to call this helper before
accessing data so reads/writes are guarded and undefined behavior is prevented.
Ensure the helper uses the same wasmCtx/memory_ access path
(memory_.data(wasmCtx)) and consistently handles errors (e.g., return an
expected/optional or throw) so callers perform no unchecked subspan operations.
♻️ Duplicate comments (8)
src/libstore/unix/build/wasi-derivation-builder.cc (1)
29-72: Address unusedenvStrsparameter and missing temp directory preopen.Two issues remain from previous review:
Unused parameter: The
envStrsparameter (computed byrunChild()withrewriteStrings()applied) is ignored. Instead, lines 43-46 rebuild the environment by iterating overenvand re-applyingrewriteStrings(). This duplicates work and risks inconsistency if env handling changes upstream. Consider parsingenvStrsdirectly.Missing temp directory: The FIXME at line 54 acknowledges that the temp directory is not preopened. Since
initEnv()setsTMPDIR,NIX_BUILD_TOP, andPWDtotmpDirInSandbox(), WASM builders attempting to access these paths will fail.Suggested fix for temp directory
if (!wasiConfig.preopen_dir( store.config->realStoreDir.get(), store.storeDir, WASMTIME_WASI_DIR_PERMS_READ | WASMTIME_WASI_DIR_PERMS_WRITE, WASMTIME_WASI_FILE_PERMS_READ | WASMTIME_WASI_FILE_PERMS_WRITE)) throw Error("cannot add store directory to WASI config"); - // FIXME: add temp dir + if (!wasiConfig.preopen_dir( + tmpDir, + tmpDirInSandbox(), + WASMTIME_WASI_DIR_PERMS_READ | WASMTIME_WASI_DIR_PERMS_WRITE, + WASMTIME_WASI_FILE_PERMS_READ | WASMTIME_WASI_FILE_PERMS_WRITE)) + throw Error("cannot add temp directory to WASI config");src/libexpr/primops/wasm.cc (6)
48-52: Replaceassertinsubspanwith a runtime check.
assertis compiled out in release builds, so out-of-bounds spans can slip through. Use a real check and throw a descriptiveError.🔧 Proposed fix
template<typename T> static std::span<T> subspan(std::span<uint8_t> s, size_t len) { - assert(s.size() >= len * sizeof(T)); - return std::span((T *) s.data(), len); + if (s.size() < len * sizeof(T)) + throw Error("buffer too small: expected at least %zu bytes, got %zu", + len * sizeof(T), s.size()); + return std::span(reinterpret_cast<T *>(s.data()), len); }
255-264: Handle endianness when reading ValueIds from WASM memory.WASM memory is little-endian; direct reads will break on big-endian hosts. Convert each 32-bit word to host order.
309-328: Ensure deterministic attribute ordering.Iterating
*value.attrs()is not deterministic; for Nix determinism this should be lexicographically sorted by attribute name before serializing into WASM memory.
334-346: Replaceassertchecks with runtime validation.The asserts guarding
attrIdxandlenare compiled out in release builds and can cause OOB reads/writes.
410-418: Make theinstancesPrecache thread-safe (and bounded).A static
unordered_mapis not thread-safe and can race under parallel evaluation; it also grows unbounded.
433-440: Rename the primop variable for clarity.
primop_fromTOMLis misleading; it registers thewasmprimop.🔧 Proposed fix
-static RegisterPrimOp primop_fromTOML( +static RegisterPrimOp primop_wasm(packaging/wasmtime.nix (1)
1-13: Add explicit copyright/attribution for nixpkgs copy.This file appears to be a verbatim copy from nixpkgs; add a copyright/attribution comment to stay compliant.
🔧 Proposed fix
{ lib, stdenv, rust_1_89, fetchFromGitHub, buildPackages, cmake, installShellFiles, nix-update-script, enableShared ? !stdenv.hostPlatform.isStatic, enableStatic ? stdenv.hostPlatform.isStatic, }: +# Derived from nixpkgs: pkgs/development/wasmtime/default.nix +# Please retain upstream copyright and license notices. rust_1_89.packages.stable.rustPlatform.buildRustPackage (finalAttrs: {
🧹 Nitpick comments (2)
src/libstore/unix/build/wasi-derivation-builder.cc (2)
5-18: Extract shared helpers to a common location.Both
unwrap()andstring2span()are duplicated fromsrc/libexpr/primops/wasm.cc(as noted by the FIXME comments). Consider extracting these to a shared header (e.g.,nix/util/wasm-utils.hh) to avoid maintenance burden and potential divergence.
61-71: Consider handling WASM trap errors more gracefully.If
fun->call()fails (e.g., the WASM module traps),unwrap()will throw anError. This exception will be caught byhandleChildException()in the parent'srunChild()wrapper, but the error message from a WASM trap may be cryptic. Consider wrapping the call with additional context:auto result = fun->call(wasmStore.context(), {}); if (!result) throw Error("WASM module '%s' failed: %s", drv.builder, result.err().message());Additionally,
_exit(0)is called unconditionally after successful execution, which is correct for the child process flow.
| NixWasmInstance(EvalState & _state, ref<NixWasmInstancePre> _pre) | ||
| : state(_state) | ||
| , pre(_pre) | ||
| , wasmStore(pre->engine) | ||
| , wasmCtx(wasmStore) | ||
| , instance(unwrap(pre->instancePre.instantiate(wasmCtx))) | ||
| , memory_(std::get<Memory>(*instance.get(wasmCtx, "memory"))) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cd /tmp && find . -name "wasm.cc" -o -name "*.h" | head -20Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
find / -name "wasm.cc" -type f 2>/dev/null | head -5Repository: DeterminateSystems/nix-src
Length of output: 116
🏁 Script executed:
git ls-files | grep -i wasmRepository: DeterminateSystems/nix-src
Length of output: 120
🏁 Script executed:
cat -n /home/jailuser/git/src/libexpr/primops/wasm.cc | head -150Repository: DeterminateSystems/nix-src
Length of output: 4898
🏁 Script executed:
cat -n /home/jailuser/git/src/libexpr/primops/wasm.cc | sed -n '145,160p'Repository: DeterminateSystems/nix-src
Length of output: 773
🏁 Script executed:
cat -n /home/jailuser/git/src/libexpr/primops/wasm.cc | sed -n '88,104p'Repository: DeterminateSystems/nix-src
Length of output: 625
Add error handling for memory export validation.
The current code unsafely assumes instance.get(wasmCtx, "memory") returns a valid Memory export without checking. This will throw std::bad_variant_access if the export is missing or has a different type. The getFunction() method (lines 145-154) demonstrates the correct pattern already used in this file—validate existence first, then use std::get_if for safe type checking.
Fix
- , memory_(std::get<Memory>(*instance.get(wasmCtx, "memory")))
+ , memory_(([&]() -> Memory {
+ auto ext = instance.get(wasmCtx, "memory");
+ if (!ext)
+ throw Error("WASM module '%s' does not export memory", pre->wasmPath);
+ auto mem = std::get_if<Memory>(&*ext);
+ if (!mem)
+ throw Error("export 'memory' of WASM module '%s' is not a memory", pre->wasmPath);
+ return *mem;
+ })())📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| NixWasmInstance(EvalState & _state, ref<NixWasmInstancePre> _pre) | |
| : state(_state) | |
| , pre(_pre) | |
| , wasmStore(pre->engine) | |
| , wasmCtx(wasmStore) | |
| , instance(unwrap(pre->instancePre.instantiate(wasmCtx))) | |
| , memory_(std::get<Memory>(*instance.get(wasmCtx, "memory"))) | |
| { | |
| NixWasmInstance(EvalState & _state, ref<NixWasmInstancePre> _pre) | |
| : state(_state) | |
| , pre(_pre) | |
| , wasmStore(pre->engine) | |
| , wasmCtx(wasmStore) | |
| , instance(unwrap(pre->instancePre.instantiate(wasmCtx))) | |
| , memory_(([&]() -> Memory { | |
| auto ext = instance.get(wasmCtx, "memory"); | |
| if (!ext) | |
| throw Error("WASM module '%s' does not export memory", pre->wasmPath); | |
| auto mem = std::get_if<Memory>(&*ext); | |
| if (!mem) | |
| throw Error("export 'memory' of WASM module '%s' is not a memory", pre->wasmPath); | |
| return *mem; | |
| })()) | |
| { |
🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 120 - 127, The constructor
NixWasmInstance currently assumes instance.get(wasmCtx, "memory") yields a
Memory and directly uses std::get, which can throw std::bad_variant_access;
replace this with the same safe pattern used in getFunction(): call
instance.get(wasmCtx, "memory"), check the optional export exists, then use
std::get_if<Memory> on the contained Export to validate the type, and if missing
or wrong type throw a clear runtime_error (or use unwrap) with a descriptive
message; set memory_ only after successful validation.
| auto memory() | ||
| { | ||
| return memory_.data(wasmCtx); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, check the file exists and get its size
wc -l src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 101
🏁 Script executed:
# Read the context around lines 162-165 to see the memory() function and surrounding code
sed -n '150,200p' src/libexpr/primops/wasm.cc | cat -nRepository: DeterminateSystems/nix-src
Length of output: 1918
🏁 Script executed:
# Search for all instances of memory().subspan in the file
rg 'memory\(\)\.subspan' src/libexpr/primops/wasm.cc -nRepository: DeterminateSystems/nix-src
Length of output: 744
🏁 Script executed:
# Look for any existing bounds checking or validation in the file
rg 'bounds|check|size\(\)|out.of.bounds' src/libexpr/primops/wasm.cc -n -iRepository: DeterminateSystems/nix-src
Length of output: 915
🏁 Script executed:
# Get the full context of the subspan helper and surrounding functions
sed -n '35,100p' src/libexpr/primops/wasm.cc | cat -nRepository: DeterminateSystems/nix-src
Length of output: 2353
🏁 Script executed:
# Look for the template definition of subspan
rg 'template.*subspan|subspan.*template' src/libexpr/primops/wasm.cc -A 5 -B 2Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Get context around the unchecked calls to understand their safety
sed -n '160,250p' src/libexpr/primops/wasm.cc | cat -nRepository: DeterminateSystems/nix-src
Length of output: 3206
🏁 Script executed:
# Check if there's any bounds validation in the calling WASM code or caller context
rg 'ptr.*len|memory.*check|validate' src/libexpr/primops/wasm.cc -i -B 2 -A 2Repository: DeterminateSystems/nix-src
Length of output: 2681
🏁 Script executed:
# Verify memory_.data() signature to understand what it returns
rg 'memory_\.' src/libexpr/primops/wasm.cc -B 3 -A 1 | head -30Repository: DeterminateSystems/nix-src
Length of output: 138
🏁 Script executed:
# Check if there are any existing overflow checks or safe access patterns elsewhere in the code
rg 'uint32_t.*ptr.*uint32_t.*len' src/libexpr/primops/wasm.cc | wc -lRepository: DeterminateSystems/nix-src
Length of output: 71
🏁 Script executed:
# Check the actual span subspan method to understand its signature
rg 'std::span|\.subspan' src/libexpr/primops/wasm.cc | head -20Repository: DeterminateSystems/nix-src
Length of output: 1151
Add bounds-checked access for WASM memory pointers.
WASM supplies untrusted ptr/len values via seven functions: panic(), warn(), make_string(), make_list(), make_attrset(), copy_attrname(), and call_function(). These functions use memory().subspan(ptr, len) without runtime bounds validation, causing out-of-bounds access and undefined behavior in release builds. The custom subspan() template helper (line 51) relies on assert() which is disabled at runtime, providing no protection. Add a checked helper function that validates ptr + len ≤ memory.size() and apply it to all memory access sites.
Suggested helper implementation
auto memory()
{
return memory_.data(wasmCtx);
}
+
+ std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len)
+ {
+ auto mem = memory();
+ uint64_t end = (uint64_t) ptr + (uint64_t) len;
+ if (end > mem.size())
+ throw Error("WASM memory out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size());
+ return mem.subspan(ptr, len);
+ }Apply to all nine memory access locations, including reads and writes at lines 169, 178, 226, 234, 257, 275, 322, 347, and 358.
🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 162 - 165, The code uses
memory().subspan(ptr, len) without runtime bounds checks, which allows OOB
access when WASM passes untrusted ptr/len; add a checked helper (e.g., a
function check_subspan(ptr, len) that fetches memory_.data(wasmCtx), verifies
ptr + len <= memory.size() and throws or returns an error/empty optional on
violation) and replace all direct memory().subspan(...) usages in the WASM entry
points (panic(), warn(), make_string(), make_list(), make_attrset(),
copy_attrname(), call_function() and any other places referencing
memory().subspan) to call this helper before accessing data so reads/writes are
guarded and undefined behavior is prevented. Ensure the helper uses the same
wasmCtx/memory_ access path (memory_.data(wasmCtx)) and consistently handles
errors (e.g., return an expected/optional or throw) so callers perform no
unchecked subspan operations.
| uint32_t get_type(ValueId valueId) | ||
| { | ||
| auto & value = *values.at(valueId); | ||
| state.forceValue(value, noPos); | ||
| auto t = value.type(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure and the specific lines mentioned
head -200 src/libexpr/primops/wasm.cc | tail -50Repository: DeterminateSystems/nix-src
Length of output: 1516
🏁 Script executed:
# Get full context around lines 182-186
sed -n '170,200p' src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 961
🏁 Script executed:
# Find all usages of values.at() in the file to understand scope
rg "values\.at\(" src/libexpr/primops/wasm.cc -nRepository: DeterminateSystems/nix-src
Length of output: 1001
🏁 Script executed:
# Check error handling patterns in the file (how Error is thrown)
rg "throw Error\(" src/libexpr/primops/wasm.cc -n -A 2 | head -40Repository: DeterminateSystems/nix-src
Length of output: 652
🏁 Script executed:
# Look at the class definition to understand values container
rg -A 20 "struct.*Evaluator\|class.*Evaluator" src/libexpr/primops/wasm.cc | head -40Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Find the values container declaration
rg "std::vector.*values\|values.*std::vector" src/libexpr/primops/wasm.cc -B 2 -A 2Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Check if there's a struct/class definition showing values member
sed -n '120,180p' src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 1794
🏁 Script executed:
# Look for any exception handlers that might catch std::out_of_range
rg "catch.*out_of_range|out_of_range" src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Check if std::out_of_range is caught anywhere in the broader codebase
rg "catch.*std::out_of_range|catch.*out_of_range" . -t cc -t h | head -20Repository: DeterminateSystems/nix-src
Length of output: 96
🏁 Script executed:
# Find the full class/struct definition of NixWasmInstance to see values declaration
sed -n '100,170p' src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 1956
🏁 Script executed:
# Search for the values member in the private section
rg "private:|std::vector.*values" src/libexpr/primops/wasm.cc -B 2 -A 5Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Check the entire file structure - look for class definition
ast-grep --pattern 'struct $$$' src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 16875
🏁 Script executed:
# Just look at the full file to understand its structure
wc -l src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 101
🏁 Script executed:
# Read the beginning of the file to see the full class definition
head -150 src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 3848
🏁 Script executed:
# Find ValueVector definition
rg "using ValueVector|typedef.*ValueVector" src/libexpr/Repository: DeterminateSystems/nix-src
Length of output: 333
🏁 Script executed:
# Also check the includes to understand the type better
head -10 src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 213
Add bounds checking for WASM-supplied ValueId values.
WASM functions can pass arbitrary ValueId values to methods like get_type(), get_int(), etc. The current code uses values.at(valueId) which throws std::out_of_range (a non-Error exception) on invalid IDs. This is inconsistent with the codebase's uniform use of Error for exception handling. Add a bounds-checking helper that throws Error, and apply it throughout (at least 11 usages across get_type, get_int, get_float, copy_string, get_bool, make_list, copy_list, make_attrset, copy_attrset, copy_attrname, and call_function).
🔧 Proposed helper + example usage
+ Value * requireValue(ValueId valueId)
+ {
+ if ((size_t) valueId >= values.size())
+ throw Error("invalid ValueId %u (size=%zu)", valueId, values.size());
+ return values[valueId];
+ }Example replacement (apply throughout):
- auto & value = *values.at(valueId);
+ auto & value = *requireValue(valueId);🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 182 - 186, Add a bounds-check
helper (e.g., validateValueId or checkValueId) that checks the supplied ValueId
against values.size() and throws the codebase Error type on out-of-range IDs,
then replace direct uses of values.at(valueId) with a call to that helper
followed by safe indexed access (values[valueId]) in all WASM-exposed accessors:
get_type, get_int, get_float, copy_string, get_bool, make_list, copy_list,
make_attrset, copy_attrset, copy_attrname, and call_function so WASM-supplied
IDs produce a uniform Error exception instead of std::out_of_range.
| template<typename R, typename... Args> | ||
| static void regFun(Linker & linker, std::string_view name, R (NixWasmInstance::*f)(Args...)) | ||
| { | ||
| unwrap(linker.func_wrap("env", name, [f](Caller caller, Args... args) -> Result<R, Trap> { | ||
| try { | ||
| auto instance = std::any_cast<NixWasmInstance *>(caller.context().get_data()); | ||
| return (*instance.*f)(args...); | ||
| } catch (Error & e) { | ||
| return Trap(e.what()); | ||
| } | ||
| })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cd /tmp && git ls-files | head -20Repository: DeterminateSystems/nix-src
Length of output: 138
🏁 Script executed:
cat -n src/libexpr/primops/wasm.cc | sed -n '360,390p'Repository: DeterminateSystems/nix-src
Length of output: 1329
🏁 Script executed:
cat -n src/libexpr/primops/wasm.cc | sed -n '200,368p'Repository: DeterminateSystems/nix-src
Length of output: 6079
Catch non-Error exceptions in host callbacks.
std::any_cast and values.at() can throw std::exception subclasses (e.g., std::bad_any_cast, std::out_of_range), which would escape uncaught and potentially terminate the host. Add handlers for std::exception and unknown exceptions.
🔧 Proposed fix
unwrap(linker.func_wrap("env", name, [f](Caller caller, Args... args) -> Result<R, Trap> {
try {
auto instance = std::any_cast<NixWasmInstance *>(caller.context().get_data());
return (*instance.*f)(args...);
} catch (Error & e) {
return Trap(e.what());
+ } catch (const std::exception & e) {
+ return Trap(e.what());
+ } catch (...) {
+ return Trap("unknown host exception");
}
}));
}🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 369 - 379, The regFun host callback
only catches Error exceptions, so std::any_cast, container access, or other
operations can throw std::exception-derived or unknown exceptions and escape;
update the lambda in regFun (the callback created in func_wrap) to add
catch(const std::exception& e) that returns Trap(e.what()) and a final
catch(...) that returns a generic Trap with an explanatory message (e.g.,
"unknown exception in host callback") so all exceptions from the Caller/context
access and method invocation are converted to Traps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 447-449: The code directly indexes the WASM return with
instance.runFunction(...).at(0).i32() which can throw if there are no returns or
the type isn't i32; update the block around instance.runFunction,
instance.values, addValue and state.forceValue to first capture the returned
vector, verify its size is >= 1, then inspect the first Value's type (the API's
type/kind accessor) to ensure it's an i32 before calling .i32(); if either check
fails, convert the condition into a clear Error (throwing an Error with a
descriptive message) so the existing catch(Error &e) handles it rather than
letting std::out_of_range or wasmtime exceptions escape.
♻️ Duplicate comments (10)
src/libexpr/primops/wasm.cc (10)
48-53: Replaceassertwith a runtime bounds check insubspan.
assertis compiled out in release builds, so this can silently allow OOB access.🔧 Proposed fix
template<typename T> static std::span<T> subspan(std::span<uint8_t> s, size_t len) { - assert(s.size() >= len * sizeof(T)); + if (s.size() < len * sizeof(T)) + throw Error("buffer too small: expected at least %zu bytes, got %zu", + len * sizeof(T), s.size()); return std::span((T *) s.data(), len); }
120-127: Validate thememoryexport beforestd::get.
std::get<Memory>will throw if the export is missing or wrong type; translate that into a properError.🔧 Proposed fix
- , memory_(std::get<Memory>(*instance.get(wasmCtx, "memory"))) + , memory_(([&]() -> Memory { + auto ext = instance.get(wasmCtx, "memory"); + if (!ext) + throw Error("WASM module '%s' does not export memory", pre->wasmPath); + auto mem = std::get_if<Memory>(&*ext); + if (!mem) + throw Error("export 'memory' of WASM module '%s' is not a memory", pre->wasmPath); + return *mem; + })())
167-179: Add bounds-checked memory access forptr/len.WASM provides untrusted pointers; direct
memory().subspan(ptr, len)can go OOB in release builds.🔧 Proposed fix (helper + usage)
auto memory() { return memory_.data(wasmCtx); } + + std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len) + { + auto mem = memory(); + uint64_t end = (uint64_t) ptr + (uint64_t) len; + if (end > mem.size()) + throw Error("WASM memory out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size()); + return mem.subspan(ptr, len); + }Then replace every
memory().subspan(ptr, len)withcheckedSubspan(ptr, len)(and thesubspan<T>wrappers on top of it).Also applies to: 223-238, 255-258, 269-276, 284-303, 309-323, 334-348, 352-359, 368-374
182-247: ValidateValueIdbounds and throwError.
values.at()throwsstd::out_of_range, which bypasses the project’sErrorhandling strategy.🔧 Proposed fix
+ Value * requireValue(ValueId valueId) + { + if ((size_t) valueId >= values.size()) + throw Error("invalid ValueId %u (size=%zu)", valueId, values.size()); + return values[valueId]; + }Apply:
- auto & value = *values.at(valueId); + auto & value = *requireValue(valueId);Also applies to: 255-266, 269-279, 284-304, 309-312, 334-338, 352-359, 373-379
255-264: Handle little‑endian conversion when reading IDs from WASM memory.WASM is little‑endian; reading
ValueId/struct fields directly is incorrect on big‑endian hosts.Also applies to: 269-279, 284-304, 309-328
324-328: Sort attributes to ensure deterministic order.The FIXME notes non‑determinism; attribute ordering should be stable across evaluations.
334-345: Replaceassertwith runtime checks incopy_attrname.Release builds drop the asserts, which can cause OOB or buffer mismatches.
🔧 Proposed fix
- assert((size_t) attrIdx < attrs.size()); + if ((size_t) attrIdx >= attrs.size()) + throw Error("attribute index %u out of bounds (size: %zu)", attrIdx, attrs.size()); std::string_view name = state.symbols[attrs[attrIdx].name]; - assert((size_t) len == name.size()); + if ((size_t) len != name.size()) + throw Error("buffer size mismatch: expected %zu bytes, got %u", name.size(), len);
389-398: Trap non‑Errorexceptions in host callbacks.
std::any_castand container access can throwstd::exceptionsubclasses. Convert them intoTraps.🔧 Proposed fix
unwrap(linker.func_wrap("env", name, [f](Caller caller, Args... args) -> Result<R, Trap> { try { auto instance = std::any_cast<NixWasmInstance *>(caller.context().get_data()); return (*instance.*f)(args...); } catch (Error & e) { return Trap(e.what()); + } catch (const std::exception & e) { + return Trap(e.what()); + } catch (...) { + return Trap("unknown host exception"); } }));
431-438: Make theinstancesPrecache thread‑safe (and bounded).Static
unordered_mapis unsafe under parallel evaluation and grows unbounded.
457-463: Renameprimop_fromTOMLto match its purpose.The identifier is misleading for a WASM primop.
🔧 Proposed fix
-static RegisterPrimOp primop_fromTOML( +static RegisterPrimOp primop_wasm(
| auto vRes = | ||
| instance.values.at(instance.runFunction(functionName, {(int32_t) instance.addValue(args[2])}).at(0).i32()); | ||
| state.forceValue(*vRes, pos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure and locate the problematic code
find . -name "wasm.cc" -type fRepository: DeterminateSystems/nix-src
Length of output: 99
🏁 Script executed:
# Read the specific lines in question with context
head -n 460 src/libexpr/primops/wasm.cc | tail -n 30Repository: DeterminateSystems/nix-src
Length of output: 1188
🏁 Script executed:
# Get broader context around these lines to understand the function
sed -n '420,470p' src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 1832
🏁 Script executed:
# Search for the definition of runFunction to understand what it returns
rg "runFunction" src/libexpr/primops/wasm.cc -A 5 -B 2Repository: DeterminateSystems/nix-src
Length of output: 737
🏁 Script executed:
# Check for any validation of WASM return values in the file
rg "at\(0\)" src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 189
🏁 Script executed:
# Search for similar patterns where WASM returns are handled
rg "\.i32\(\)" src/libexpr/primops/wasm.cc -B 3 -A 1Repository: DeterminateSystems/nix-src
Length of output: 302
🏁 Script executed:
# Search for Val type definition
rg "struct Val|class Val" src/libexpr/primops/wasm.cc -A 10Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Search for i32() method definition
rg "\.i32\(" src/libexpr/primops/wasm.cc -B 2 -A 2Repository: DeterminateSystems/nix-src
Length of output: 267
🏁 Script executed:
# Look for the Val class or struct definition more broadly
rg "i32\(\)" src/ --type cc -B 5 -A 5 | head -50Repository: DeterminateSystems/nix-src
Length of output: 96
🏁 Script executed:
# Check what unwrap does and if it validates
rg "unwrap" src/libexpr/primops/wasm.cc -B 2 -A 3Repository: DeterminateSystems/nix-src
Length of output: 1090
🏁 Script executed:
# Look for Val definition in header files
find . -name "*.h" -o -name "*.hh" | xargs grep -l "struct Val\|class Val" 2>/dev/null | head -5Repository: DeterminateSystems/nix-src
Length of output: 294
🏁 Script executed:
# Look at the includes in wasm.cc
head -50 src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 1166
🏁 Script executed:
# Search for Val in all files
rg "struct Val\|class Val" . --type-list | head -20Repository: DeterminateSystems/nix-src
Length of output: 459
🏁 Script executed:
# Try searching for wasmtime Val
rg "i32()" . -B 3 -A 3 | head -50Repository: DeterminateSystems/nix-src
Length of output: 557
🏁 Script executed:
# Look for imports or using statements related to wasmtime
rg "using.*Val|#include.*wasmtime" src/libexpr/primops/wasm.cc -A 2 -B 2Repository: DeterminateSystems/nix-src
Length of output: 308
🏁 Script executed:
# Check if this is wasmtime-cpp library
rg "wasmtime\|wasm\\.h\|Val" src/libexpr/primops/wasm.cc | head -20Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Search for other uses of .i32() in the repository to understand patterns
rg "\.i32()" . -B 3 -A 2Repository: DeterminateSystems/nix-src
Length of output: 501
🏁 Script executed:
# Check if there's any validation before accessing vector results
rg "runFunction" . -B 2 -A 5 | head -60Repository: DeterminateSystems/nix-src
Length of output: 1307
🏁 Script executed:
# Look for test files or examples that might show expected usage
find . -name "*wasm*test*" -o -name "*test*wasm*"Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Check if there's any documentation or comments about return validation
rg "WASM|wasm" src/libexpr/primops/wasm.cc | grep -i "return\|result\|valid"Repository: DeterminateSystems/nix-src
Length of output: 639
🏁 Script executed:
# Search for any wrapper or validation around Val.i32() in the codebase
rg "Val.*i32|i32.*Val" . -B 3 -A 3Repository: DeterminateSystems/nix-src
Length of output: 557
🏁 Script executed:
# Look for the actual code handling values.at() to see if there's pattern matching
rg "values\.at" src/libexpr/primops/wasm.cc -B 5 -A 5Repository: DeterminateSystems/nix-src
Length of output: 4517
🏁 Script executed:
# Check what the .at() method on std::vector throws
# and whether the catch block would catch it
rg "catch.*std\|catch.*exception" src/libexpr/primops/wasm.cc -B 2 -A 2Repository: DeterminateSystems/nix-src
Length of output: 52
🏁 Script executed:
# Get the full context of the try-catch block to understand scope
sed -n '400,460p' src/libexpr/primops/wasm.ccRepository: DeterminateSystems/nix-src
Length of output: 2698
🏁 Script executed:
# Verify if Val.i32() can throw exceptions by looking at wasmtime documentation or patterns
# Check if there are any type checks before calling i32()
rg "i32()|type()" src/libexpr/primops/wasm.cc -B 2 -A 2Repository: DeterminateSystems/nix-src
Length of output: 1299
🏁 Script executed:
# Look for any validation pattern before accessing results
rg "empty()|size()" src/libexpr/primops/wasm.cc -B 3 -A 1Repository: DeterminateSystems/nix-src
Length of output: 1933
Add validation before accessing WASM function return value type and arity.
The code chains .at(0).i32() without validating that the function returns at least one value or that it's an i32 type. If either check fails, std::out_of_range or a wasmtime exception will be thrown—neither caught by the existing catch (Error & e) block. Check the return vector size and validate the type explicitly before accessing the result, converting any failures to a clear Error.
🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 447 - 449, The code directly
indexes the WASM return with instance.runFunction(...).at(0).i32() which can
throw if there are no returns or the type isn't i32; update the block around
instance.runFunction, instance.values, addValue and state.forceValue to first
capture the returned vector, verify its size is >= 1, then inspect the first
Value's type (the API's type/kind accessor) to ensure it's an i32 before calling
.i32(); if either check fails, convert the condition into a clear Error
(throwing an Error with a descriptive message) so the existing catch(Error &e)
handles it rather than letting std::out_of_range or wasmtime exceptions escape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 420-437: The read_file function can overflow the WASM memory check
and truncate large files: ensure you validate bounds using size_t arithmetic
before copying by checking that contents.size() <= len AND that ptr +
contents.size() does not wrap and is <= memory().size(); perform the
pointer+length check with a non-wrapping check (e.g., ptr <= memory().size() &&
contents.size() <= memory().size() - ptr) and use a safe min to copy only up to
the provided len, and handle/return sizes > UINT32_MAX explicitly (either clamp
to UINT32_MAX or return an error) so the uint32_t return and memcpy cannot
silently overflow or truncate. Use the function name read_file, the
pathValue/path.readFile() result (contents), and memory() when locating the
changes.
♻️ Duplicate comments (2)
src/libexpr/primops/wasm.cc (2)
496-504: Verify WASM function return value handling.The chained access
instance.runFunction(...).at(0).i32()at line 502 assumes the WASM function returns exactly one i32 value. This was flagged in past reviews. Additionally, the FIXME at line 498 about using the "start" function represents incomplete functionality.Consider adding explicit validation:
#!/bin/bash # Check if there's any documentation or tests showing expected WASM function signatures rg -n "nix_wasm_init_v1|runFunction" --type=cpp -C 3
511-518: Rename misleading variable name.The variable
primop_fromTOMLshould beprimop_wasmto match the registered primop name. This was flagged in past reviews.
🧹 Nitpick comments (2)
src/libexpr/primops/wasm.cc (2)
94-104: Consider refactoring statement expression for portability.The constructor uses a GNU C++ statement expression
({...})(lines 97-101) in the member initializer list, which is a non-standard extension. While GCC and Clang support it, this reduces portability.♻️ Refactor to use a static factory or helper method
+private: + static InstancePre createInstancePre(const SourcePath& wasmPath, Engine& engine) + { + Linker linker(engine); + regFuns(linker); + return unwrap(instantiate_pre(linker, unwrap(Module::compile(engine, string2span(wasmPath.readFile()))))); + } + +public: NixWasmInstancePre(SourcePath _wasmPath) : engine(getEngine()) , wasmPath(_wasmPath) - , instancePre(({ - Linker linker(engine); - regFuns(linker); - unwrap(instantiate_pre(linker, unwrap(Module::compile(engine, string2span(wasmPath.readFile()))))); - })) + , instancePre(createInstancePre(_wasmPath, engine)) { }
182-197: Type mismatch in error-throwing lambda.The lambda at line 196 returns
intbut is used in a context expectinguint32_t. While this works due to implicit conversion, it's clearer to match the expected type.- : []() -> int { throw Error("unsupported type"); }(); + : []() -> uint32_t { throw Error("unsupported type"); }();
| uint32_t read_file(ValueId pathId, uint32_t ptr, uint32_t len) | ||
| { | ||
| auto & pathValue = *values.at(pathId); | ||
| state.forceValue(pathValue, noPos); | ||
| if (pathValue.type() != nPath) | ||
| throw Error("read_file expects a path value"); | ||
|
|
||
| auto path = pathValue.path(); | ||
| auto contents = path.readFile(); | ||
|
|
||
| // FIXME: this is an inefficient interface since it may cause the file to be read twice. | ||
| if (contents.size() <= len) { | ||
| auto buf = memory().subspan(ptr, len); | ||
| memcpy(buf.data(), contents.data(), contents.size()); | ||
| } | ||
|
|
||
| return contents.size(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider potential integer overflow in buffer size comparison.
When len is close to UINT32_MAX, the comparison contents.size() <= len could pass even if the actual copy would overflow the WASM linear memory. The memory bounds checking issue flagged in past reviews applies here as well.
Additionally, returning the file size as uint32_t could silently truncate files larger than 4GB.
🔧 Suggested bounds validation
uint32_t read_file(ValueId pathId, uint32_t ptr, uint32_t len)
{
auto & pathValue = *values.at(pathId);
state.forceValue(pathValue, noPos);
if (pathValue.type() != nPath)
throw Error("read_file expects a path value");
auto path = pathValue.path();
auto contents = path.readFile();
+ if (contents.size() > std::numeric_limits<uint32_t>::max())
+ throw Error("file too large for WASM (size: %zu)", contents.size());
+
// FIXME: this is an inefficient interface since it may cause the file to be read twice.
if (contents.size() <= len) {
auto buf = memory().subspan(ptr, len);
memcpy(buf.data(), contents.data(), contents.size());
}
return contents.size();
}🤖 Prompt for AI Agents
In `@src/libexpr/primops/wasm.cc` around lines 420 - 437, The read_file function
can overflow the WASM memory check and truncate large files: ensure you validate
bounds using size_t arithmetic before copying by checking that contents.size()
<= len AND that ptr + contents.size() does not wrap and is <= memory().size();
perform the pointer+length check with a non-wrapping check (e.g., ptr <=
memory().size() && contents.size() <= memory().size() - ptr) and use a safe min
to copy only up to the provided len, and handle/return sizes > UINT32_MAX
explicitly (either clamp to UINT32_MAX or return an error) so the uint32_t
return and memcpy cannot silently overflow or truncate. Use the function name
read_file, the pathValue/path.readFile() result (contents), and memory() when
locating the changes.
Motivation
This PR adds support for extending Nix using WASM (specifically
wasmtime) in two ways:It adds a new builtin function,
builtins.wasm, that allows Nix functions to be written in any language that compiles to WASM. For example,The goal is to allow complex functionality (like YAML parsers) to be written in more convenient languages (like Rust) and executed faster, without having to resort to import-from-derivation.
The WASM function can lazily traverse its argument and can produce arbitrary Nix values (except functions). Each WASM function call is executed as a fresh instance of the WASM VM, ensuring determinism (i.e. you can't use the WASM memory to pass state between function calls).
See Initial bindings/examples for writing Nix functions in Rust nix-wasm-rust#1 for examples of Nix functions written in Rust.
It adds a new system type,
wasm32-wasip1, that allows for system-independent derivations. For example:Example here: https://github.com/DeterminateSystems/nix-wasi-rust
Context
Summary by CodeRabbit
New Features
Packaging
Build / Integration
Experimental Features
✏️ Tip: You can customize this high-level summary in your review settings.