reactive: zero-alloc data structures and wave-based API#8277

Draft
cristianoc wants to merge 1 commit into master from reactive-noalloc-current-mechanism
@cristianoc (Collaborator) commented Mar 5, 2026

reactive: zero-alloc data structures and wave-based API

Motivation

The reactive incremental engine maintains hash maps, sets, and nested containers (map-of-sets, map-of-maps) that are updated on every file change. Under the old Hashtbl-based implementation, each lookup returned a boxed option, each iteration allocated a closure, and inner containers in structures like contributions: (k2, (k1, v2) Hashtbl.t) Hashtbl.t were created and abandoned on key churn. In a replay of 56 sequential commits on a real codebase (hyperindex), these micro-allocations dominated GC pressure in steady state.

This PR eliminates all steady-state allocation in the reactive engine's core data path.

Design

1. ReactiveHash.Map / ReactiveHash.Set (434 + 63 LOC)

Custom open-addressing hash tables vendored from Hachis (François Pottier, Inria Paris), adapted for the reactive engine's usage patterns. Key properties:

  • Linear probing with power-of-2 capacity, void/tomb sentinels, and 82% max occupancy threshold. After tables reach steady-state capacity, clear + replace cycles allocate zero heap words.
  • Type-erased storage via Obj.t arrays — a single concrete table type backs both Map and Set, avoiding functor overhead.
  • iter_with / exists_with — iteration with an extra context argument, avoiding closure allocation on every call. This is the critical pattern: instead of Map.iter (fun k v -> ... captured_state ...), callers write Map.iter_with f state map where f is a module-level function.
  • find_maybe — returns ReactiveMaybe.t instead of option, eliminating the Some box on every lookup.
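
To make the iter_with pattern concrete, here is a minimal sketch. It is not the PR's ReactiveHash code: Tiny_map, its array-backed layout, the acc record, and the callback argument order (context first) are all assumptions made for illustration. What it shows is the shape of the call site: the callback is a closed module-level function and the state it needs arrives as an explicit argument.

```ocaml
(* Hypothetical miniature container; stands in for ReactiveHash.Map only
   to show the calling convention, not the hashing or probing. *)
module Tiny_map = struct
  type ('k, 'v) t = {
    keys : 'k array;
    vals : 'v array;
    mutable len : int;
  }

  let create ~capacity ~dummy_key ~dummy_val =
    { keys = Array.make capacity dummy_key;
      vals = Array.make capacity dummy_val;
      len = 0 }

  (* No bounds handling beyond the array's own check: sketch only. *)
  let add t k v =
    t.keys.(t.len) <- k;
    t.vals.(t.len) <- v;
    t.len <- t.len + 1

  (* The context is threaded through explicitly, so [f] never needs to
     capture anything from the caller's environment. *)
  let iter_with f ctx t =
    for i = 0 to t.len - 1 do
      f ctx t.keys.(i) t.vals.(i)
    done
end

(* Pre-allocated scratch state, created once and reused. *)
type acc = { mutable total : int }

(* Module-level callback: it has no free variables, so passing it does
   not build a fresh closure at each call site. *)
let add_to_total acc _key value = acc.total <- acc.total + value

let sum_values acc m =
  acc.total <- 0;
  Tiny_map.iter_with add_to_total acc m;
  acc.total
```

The closure-based equivalent, `iter (fun _ v -> total := !total + v) m`, would typically allocate a closure capturing `total` every time `sum_values` runs, which is the cost the context argument avoids.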

2. ReactiveMaybe.t (17 LOC)

An unboxed optional: none is a physically unique sentinel, some v is Obj.repr v (zero allocation). is_some / unsafe_get are inline comparisons. This replaces option at every map lookup boundary and in wave payloads for remove-vs-set discrimination.
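
A sentinel-based optional along these lines could look like the sketch below. The module name Maybe and the exact signature are assumptions; only none, some, is_some, and unsafe_get are named in the description above. One known limitation of this encoding is that wrapping none itself with some is indistinguishable from none.

```ocaml
(* Hypothetical sketch of an unboxed optional; not the PR's ReactiveMaybe. *)
module Maybe : sig
  type 'a t
  val none : 'a t
  val some : 'a -> 'a t
  val is_some : 'a t -> bool
  val unsafe_get : 'a t -> 'a
end = struct
  type 'a t = Obj.t

  (* A mutable block allocated once at module initialisation. Its address
     is unique, and it is only ever compared with physical equality. *)
  let sentinel : Obj.t = Obj.repr (ref 0)

  let none = sentinel

  (* Reinterprets the value without wrapping it: no Some box. *)
  let some v = Obj.repr v

  let is_some m = m != sentinel

  (* Only valid when [is_some m] holds; the caller must check first. *)
  let unsafe_get m = Obj.obj m
end
```

Because the type is abstract in the signature, callers cannot forge or inspect the sentinel, and the Obj usage stays confined to this one module.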

3. ReactiveWave.t (31 LOC)

A fixed-capacity pair of Obj.t arrays (keys + values) with an integer length counter. Waves replace the delta variant type (Set | Remove | Batch of (k * v option) list) that previously allocated a list cons cell per entry per propagation step. clear just resets the length to 0. Waves are allocated once at combinator creation time and reused across all subsequent processing cycles.
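
The reuse cycle can be sketched as follows. This version uses typed arrays filled with caller-supplied dummy values instead of Obj.t arrays, and the names Wave, push, and length are assumptions; the point is only that clear resets a counter and push writes into storage that already exists.

```ocaml
(* Hypothetical sketch of a reusable wave buffer; not the PR's ReactiveWave. *)
module Wave = struct
  type ('k, 'v) t = {
    keys : 'k array;
    vals : 'v array;
    mutable len : int;
  }

  (* The only allocation: done once, when the combinator is created. *)
  let create ~capacity ~dummy_key ~dummy_val =
    { keys = Array.make capacity dummy_key;
      vals = Array.make capacity dummy_val;
      len = 0 }

  (* Resetting the counter is enough; old slots are simply overwritten. *)
  let clear w = w.len <- 0

  let length w = w.len

  let push w k v =
    if w.len >= Array.length w.keys then invalid_arg "Wave.push: full";
    w.keys.(w.len) <- k;
    w.vals.(w.len) <- v;
    w.len <- w.len + 1

  (* Context-passing iteration, so consumers need no closure. *)
  let iter_with f ctx w =
    for i = 0 to w.len - 1 do
      f ctx w.keys.(i) w.vals.(i)
    done
end
```

One trade-off of this simplified clear is that stale entries stay reachable until overwritten, which is harmless for short-lived payloads but can keep values alive longer than a list-based delta would.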

4. ReactivePoolMapSet (107 LOC) and ReactivePoolMapMap (102 LOC)

Pooled container-of-containers with deterministic inner-container recycling.

Problem: structures like pred_map: (k, k Set) Map and contributions: (k2, (k1, v2) Map) Map exhibit key churn — outer keys appear and disappear across incremental updates. Under the old design, each new outer key allocated a fresh inner container, and removal just dropped it for GC.

Solution: both modules maintain an internal free-list (stack of cleared inner containers). The API forces callers through lifecycle-aware operations:

  • add / replace — reuses a pooled inner container on first access to a new key, or allocates if the pool is empty (pool_miss_create event).
  • drain_key / drain_outer — iterates the inner container, then clears and returns it to the pool.
  • remove_from_set_and_recycle_if_empty / remove_from_inner_and_recycle_if_empty — removes one element; if the inner container becomes empty, clears and recycles it.

After warmup (the first request in the replay), the pool satisfies nearly all inner-container demands. Measured on the hyperindex replay: 31,963 initial pool misses for sets, then 138 misses across the remaining 55 requests combined, so inner-container allocation in steady state is close to zero.
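
The recycling lifecycle can be illustrated with a small sketch. It uses the standard library's Hashtbl and Stack as stand-ins, so unlike the PR's modules it still allocates (option results, closures); the module name Pool_map_set and the pool_misses counter are assumptions. Only the free-list behaviour is the point here.

```ocaml
(* Hypothetical sketch of a pooled map-of-sets; not the PR's ReactivePoolMapSet. *)
module Pool_map_set = struct
  type ('k, 'e) t = {
    outer : ('k, ('e, unit) Hashtbl.t) Hashtbl.t;
    pool : ('e, unit) Hashtbl.t Stack.t;  (* cleared inner sets, ready for reuse *)
    mutable pool_misses : int;
  }

  let create () =
    { outer = Hashtbl.create 16; pool = Stack.create (); pool_misses = 0 }

  (* Take a cleared set from the pool, or allocate one and count the miss. *)
  let fresh_inner t =
    match Stack.pop_opt t.pool with
    | Some s -> s
    | None ->
      t.pool_misses <- t.pool_misses + 1;
      Hashtbl.create 8

  let add t k e =
    let inner =
      match Hashtbl.find_opt t.outer k with
      | Some s -> s
      | None ->
        let s = fresh_inner t in
        Hashtbl.replace t.outer k s;
        s
    in
    Hashtbl.replace inner e ()

  (* Detach the inner set, clear it, and return it to the pool.
     Hashtbl.clear keeps the bucket array, so capacity is retained. *)
  let recycle t k inner =
    Hashtbl.remove t.outer k;
    Hashtbl.clear inner;
    Stack.push inner t.pool

  let remove_from_set_and_recycle_if_empty t k e =
    match Hashtbl.find_opt t.outer k with
    | None -> ()
    | Some inner ->
      Hashtbl.remove inner e;
      if Hashtbl.length inner = 0 then recycle t k inner

  let drain_key t k f =
    match Hashtbl.find_opt t.outer k with
    | None -> ()
    | Some inner ->
      Hashtbl.iter (fun e () -> f e) inner;
      recycle t k inner
end
```

Since an inner set only leaves the outer map through recycle, a key that disappears and a key that appears later can share the same inner set, which is what keeps the miss count flat once the pool is warm.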

5. Zero-alloc combinators

All four combinators (flatMap, join, union, fixpoint) were rewritten to:

  • Use iter_with with module-level callback functions instead of closures.
  • Store scratch state in pre-allocated mutable fields on the combinator's record (affected, scratch, merge_acc, etc.).
  • Accept and emit ReactiveWave.t instead of delta lists.
  • Use ReactivePoolMapSet for provenance tracking and ReactivePoolMapMap for contribution aggregation (flatMap, join).

The fixpoint combinator additionally migrated pred_map from Map<k, Map<k, unit>> to ReactivePoolMapSet (recognizing that it is semantically a map-of-sets), and has_live_predecessor now uses the new Set.exists_with for early-exit iteration.
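
As a rough illustration of early-exit iteration with a context argument, the sketch below runs exists_with over a plain array prefix rather than a hash set. The liveness record, the integer node ids, and the array-based predecessor list are assumptions made for the example; the actual fixpoint state in the PR is not shown in this description.

```ocaml
(* Top-level recursive helper: every piece of state is an argument, so no
   local closure is created. The recursive call is in tail position. *)
let rec exists_from pred ctx arr len i =
  i < len && (pred ctx arr.(i) || exists_from pred ctx arr len (i + 1))

(* Stops at the first element satisfying [pred ctx]. *)
let exists_with pred ctx arr len = exists_from pred ctx arr len 0

(* Hypothetical liveness table indexed by node id. *)
type liveness = { alive : bool array }

(* Module-level predicate; the table arrives as the context argument. *)
let is_live liveness node = liveness.alive.(node)

(* [preds] holds predecessor ids; only the first [n_preds] entries are valid. *)
let has_live_predecessor liveness preds n_preds =
  exists_with is_live liveness preds n_preds
```

Writing the loop as a top-level function rather than a nested `let rec go` matters here: a nested helper that captures pred and ctx would usually be allocated as a closure on each call.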

6. ReactiveAllocTrace (80 LOC)

Two-level tracing controlled by RESCRIPT_REACTIVE_ALLOC_TRACE:

  • Level 1 (=1): logs allocation events only (map/set create, table resize, pool miss, pool resize).
  • Level 2 (=2): also logs operational events (drain, remove-recycle) for full lifecycle analysis.

Events are written as single-line strings to a file descriptor, with event kinds split into alloc_event_kind and op_event_kind types. The emit_alloc_kind / emit_op_kind functions check the level before writing, so level-1 tracing has zero overhead for operational events.
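
A level-gated emitter of this kind might be sketched as follows. The constructor names, most of the event strings, the Buffer sink, and the explicit ~level argument are assumptions chosen to keep the example testable; the description above only names the environment variable, the two kind types, the two emit functions, and the pool_miss_create event.

```ocaml
(* Hypothetical event kinds, split the same way as in the description. *)
type alloc_event_kind =
  | Map_create | Set_create | Table_resize | Pool_miss | Pool_resize

type op_event_kind = Drain | Remove_recycle

let level_of_string = function
  | Some "1" -> 1
  | Some "2" -> 2
  | Some _ | None -> 0

(* Reads the variable named in the PR; unset or unknown values disable tracing. *)
let trace_level () =
  level_of_string (Sys.getenv_opt "RESCRIPT_REACTIVE_ALLOC_TRACE")

let string_of_alloc_kind = function
  | Map_create -> "map_create"
  | Set_create -> "set_create"
  | Table_resize -> "table_resize"
  | Pool_miss -> "pool_miss_create"
  | Pool_resize -> "pool_resize"

let string_of_op_kind = function
  | Drain -> "drain"
  | Remove_recycle -> "remove_recycle"

(* Allocation events are written at level 1 and above. *)
let emit_alloc_kind ~level buf kind =
  if level >= 1 then begin
    Buffer.add_string buf (string_of_alloc_kind kind);
    Buffer.add_char buf '\n'
  end

(* Operational events need level 2; at level 1 this is a single comparison. *)
let emit_op_kind ~level buf kind =
  if level >= 2 then begin
    Buffer.add_string buf (string_of_op_kind kind);
    Buffer.add_char buf '\n'
  end
```

Keeping the two kinds as separate types means an operational event cannot be passed to the allocation emitter by mistake, so the level check for each family lives in exactly one place.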

Migration summary

| Structure | Before | After |
|---|---|---|
| All hash maps/sets | Hashtbl.t | ReactiveHash.Map.t / Set.t |
| Map lookups | find_opt → option | find_maybe → ReactiveMaybe.t |
| Delta propagation | Set \| Remove \| Batch of list | ReactiveWave.t |
| flatMap.provenance | ad-hoc Map<k1, k2 list> | ReactivePoolMapSet |
| flatMap.contributions | Map<k2, Map<k1,v2>> with manual get_contributions | ReactivePoolMapMap |
| join.contributions | same pattern | ReactivePoolMapMap |
| fixpoint.pred_map | Map<k, Map<k, unit>> | ReactivePoolMapSet |
| Combinator iteration | closure per call | iter_with + module-level functions |

Testing

53 tests across 5 test modules. The 20 allocation tests (AllocTest.ml, 642 LOC) measure Gc.stat().minor_words across warmup + measured iterations and assert:

  • words/iter = 0 for fixpoint, flatMap, join, union in steady state.
  • pool_miss_delta = 0 for both PoolMapSet and PoolMapMap churn patterns after warmup.
  • Functional correctness of drain/recycle cycles (outer cardinal, empty-inner counts).
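
The measurement idea behind the words/iter assertion can be sketched with a small helper. This is not taken from AllocTest.ml: the helper name words_per_iter and its parameters are assumptions, and it reads the counter through Gc.minor_words rather than Gc.stat. In bytecode the counter reads themselves may box a float, so an exact-zero assertion is best reserved for native code.

```ocaml
(* Hypothetical helper: average minor-heap words allocated per call of [f],
   measured after a warmup phase that lets tables and pools reach capacity. *)
let words_per_iter ~warmup ~iters f =
  for _i = 1 to warmup do f () done;
  let before = Gc.minor_words () in
  for _i = 1 to iters do f () done;
  let after = Gc.minor_words () in
  (after -. before) /. float_of_int iters
```

A steady-state test would then run one update cycle of a combinator as `f` and compare the result against zero, while a deliberately allocating `f` (for example one that builds a list) serves as a sanity check that the counter is actually moving.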

- Zero-alloc fixpoint, flatMap, join, union, source, scheduler
- ReactiveHash.Map/Set with ReactiveMaybe for zero-alloc lookups
- ReactivePoolMapSet for zero-alloc map-of-sets with set recycling
- ReactivePoolMapMap for zero-alloc map-of-maps with inner-map recycling
- ReactiveAllocTrace with two-level tracing (alloc-only vs alloc+ops)
- Wave-based emit API with ReactiveMaybe
- Comprehensive allocation tests

Signed-Off-By: Cristiano Calcagno <[email protected]>
@cristianoc changed the title from "reactive: zero-alloc current mechanism (squashed)" to "reactive: zero-alloc data structures and wave-based API" on Mar 5, 2026
pkg-pr-new bot commented Mar 5, 2026

rescript

npm i https://pkg.pr.new/rescript@8277

@rescript/darwin-arm64

npm i https://pkg.pr.new/@rescript/darwin-arm64@8277

@rescript/darwin-x64

npm i https://pkg.pr.new/@rescript/darwin-x64@8277

@rescript/linux-arm64

npm i https://pkg.pr.new/@rescript/linux-arm64@8277

@rescript/linux-x64

npm i https://pkg.pr.new/@rescript/linux-x64@8277

@rescript/runtime

npm i https://pkg.pr.new/@rescript/runtime@8277

@rescript/win32-x64

npm i https://pkg.pr.new/@rescript/win32-x64@8277

commit: acec4fb
