Production-ready WebAssembly support

#503 covers the initial investigation into WebAssembly. Due to funding limitations, that work was scoped to a 6-week effort. This issue is about the next phase: productionizing the current prototype.

> [!NOTE]
> ~~We are seeking ~~$20k~~ $12k US in funding to make this project possible. If you or your company could benefit from the portability and performance improvements of this work and could potentially help fund it, please get in touch with @pdubroy.~~

_This project is now fully funded through the generous support of @shopify!_ 🙏

## Background

The goal of the MVP was:

> The primary goal is to be able to produce a WebAssembly module (`.wasm`) from an existing Ohm grammar. The Wasm module would export a `match` function, which could be used just like the existing `Grammar.match` method.
>
> At minimum, we also need an API (or ABI) for walking the parse tree.

This was successfully achieved, and the experiment showed that the Wasm version has significantly better performance on some real-world grammars (e.g. >10x on a 10k ES5 source file).

## Goals

The next phase is to make the current prototype production-ready:

- Implement grammar features that are not supported in the MVP: parameterized rules, left recursion, missing primitive rules (`caseInsensitive`, `lower`, `upper`, `UnicodeLtmo`).
- Add implicit space skipping.
- Remove the input length limitation (support inputs >64k)
- (Maybe) support grammars with >256 rules.
- Implement error-handling.
- Implement a drop-in replacement for `toAST()` so at least some users could adopt the Wasm version without large-scale code changes.
- Make this available in a published NPM package (either a new major version of `ohm-js`, or in a separate package, etc. `@ohmjs/wasm`).
- Documentation.
 
## Non-goals

Eventual goals, but not included in this phase:
 
- Incremental parsing.
- Fully replacing the functionality of `pexprs-eval.js` with the WebAssembly version, and rewriting the semantics code to work with CSTs in Wasm linear memory.

## Estimate

- Parameterized rules: 1w
- Left recursion: 1w
- Missing primitive rules: 1w
- Space-skipping: 1.5w
- Input length: 1d
- Error-handling: 2w
- `toAST`: 1.5w
- Publish package: 1d
- Docs: 2d
- More testing/benchmarks: 1w

**Total:** 10w

Time-permitting: 
- Work on reducing memory usage and/or parsing performance:
  - Dynamic memory allocation for memo table (HAMTs?)
  - More sophisticated memoization strategies
  - Compile-time optimization for parsing expressions

Reduced scope (~7w):
- Leave out implicit space-skipping and left recursion.
- Minimal documentation
- Focus testing & benchmarking on some existing real-world grammars (e.g. Shopify's [liquid-html grammars](https://github.com/Shopify/theme-tools/blob/main/packages/liquid-html-parser/grammar/liquid-html.ohm)).



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Production-ready WebAssembly support #511

Background

Goals

Non-goals

Estimate

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Production-ready WebAssembly support #511

Description

Background

Goals

Non-goals

Estimate

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions