Description
The current runtime message implementation uses JSON serialization as the canonical string representation for most message types (String(), MarshalJSON(), printing to stdout, etc.). This aligns well with the language design goal of strict separation between code and data and the guarantee that all runtime data is serializable.
However, the current approach introduces several performance, correctness, and maintainability concerns, especially as message sizes and output frequency grow.
This issue proposes reviewing and potentially redesigning the runtime formatting / serialization layer.
Current behavior (summary)
From the runtime implementation:
- Most message types (BoolMsg, IntMsg, FloatMsg, StringMsg, ListMsg, DictMsg, StructMsg) implement:
  - String() → often calls MarshalJSON()
  - MarshalJSON() → usually delegates to encoding/json
- Composite types (ListMsg, DictMsg, StructMsg) rely on:
  - Go reflection (json.Marshal)
  - Temporary maps / slices
  - Post-processing JSON strings with strings.ReplaceAll to insert spaces
- UnionMsg.String() manually formats JSON-like output using fmt.Sprintf
- Printing and formatting are implicitly tied to JSON encoding, even for simple debug output
Identified issues
1. Performance overhead
- encoding/json uses reflection and allocations for composite values
- MarshalJSON → []byte → string creates unnecessary allocations
- Repeated strings.ReplaceAll causes additional passes over encoded data
- Printing complex values in hot paths may become a bottleneck even when I/O is not
2. Inconsistent serialization paths
- Some types use json.Marshal, others hand-format JSON (UnionMsg)
- Spacing and formatting rules are not centralized
- String() and MarshalJSON() semantics are tightly coupled but implemented differently across types
3. Determinism concerns
- DictMsg and StructMsg rely on Go maps internally during marshaling
- Key order is not guaranteed unless explicitly sorted
- This affects reproducibility, snapshot tests, and debugging
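As a point of reference, encoding/json sorts map keys when it marshals a Go map directly, but any hand-written loop over a map sees keys in an unspecified order. A minimal sketch of explicit sorting follows; formatDict and its map[string]int argument are hypothetical stand-ins, not the runtime's actual DictMsg code.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// formatDict renders a map with its keys in sorted order, so the output
// is identical on every run. Name and signature are hypothetical.
func formatDict(m map[string]int) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // explicit ordering instead of map iteration order

	var b strings.Builder
	b.WriteByte('{')
	for i, k := range keys {
		if i > 0 {
			b.WriteString(", ")
		}
		// strconv.Quote applies Go escaping rules, which match JSON for
		// plain ASCII keys but not for every possible string.
		b.WriteString(strconv.Quote(k))
		b.WriteString(": ")
		b.WriteString(strconv.Itoa(m[k]))
	}
	b.WriteByte('}')
	return b.String()
}

func main() {
	fmt.Println(formatDict(map[string]int{"b": 2, "a": 1, "c": 3}))
	// prints {"a": 1, "b": 2, "c": 3}
}
```

Sorting costs O(n log n) per dict, which is the price of stable snapshot tests; a struct with a fixed field list could skip the sort by emitting fields in declaration order.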
4. Semantic edge cases
- JSON loses type distinctions (e.g. int vs float vs uint in the future)
- NaN / ±Inf handling for FloatMsg is undefined
- Binary data would require additional conventions
- Round-tripping printed output back into the language is ambiguous
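Two of these edge cases can be observed with the standard library alone. The sketch below shows how encoding/json treats a plain float64; marshalFloat is a hypothetical helper, and how FloatMsg itself behaves depends on the runtime's own MarshalJSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// marshalFloat reports what encoding/json produces for a float64.
// Hypothetical helper, used only for this illustration.
func marshalFloat(f float64) (string, error) {
	b, err := json.Marshal(f)
	return string(b), err
}

func main() {
	// A whole-valued float is encoded without a decimal point, so it is
	// indistinguishable from an integer once serialized.
	// NaN and ±Inf have no JSON representation and yield an error.
	for _, f := range []float64{1.5, 1.0, math.NaN(), math.Inf(1)} {
		s, err := marshalFloat(f)
		fmt.Printf("%v -> %q (err: %v)\n", f, s, err)
	}
}
```

Whatever policy is chosen for NaN and ±Inf (an error, null, or a tagged string), it has to be written down explicitly, because JSON itself does not provide one.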
5. API & design rigidity
- Formatting is implicitly JSON-only
- No distinction between:
  - fast debug printing
  - canonical JSON serialization
  - user-facing formatting
- Hard to optimize or evolve without touching many message types
Why this matters
While JSON-as-default is a reasonable MVP decision, the runtime now:
- Pays full JSON encoding cost even when it may not be needed
- Mixes concerns (debug printing vs canonical serialization)
- Makes future optimizations harder due to scattered logic
Given that the runtime already has a closed set of message kinds (BoolMsg, IntMsg, ListMsg, StructMsg, UnionMsg, etc.), this is an opportunity to:
- Avoid reflection entirely
- Centralize formatting logic
- Make performance characteristics explicit and predictable
Proposed directions (non-exclusive)
Option A: Centralized non-reflective encoder
- Implement a custom encoder that operates directly on Msg variants
- Stream output to io.Writer instead of allocating []byte
- Preserve JSON as the canonical format, but without encoding/json
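A rough sketch of what Option A could look like, assuming a closed set of variants that a type switch can cover. The Msg interface, the four types, and the Encode / EncodeToString names below are simplified, hypothetical stand-ins for the runtime's real definitions, and the ", " separator is an assumption about the desired spacing.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strconv"
	"strings"
)

// Msg is a hypothetical stand-in for the runtime's message interface.
type Msg interface{ isMsg() }

type (
	BoolMsg   bool
	IntMsg    int64
	StringMsg string
	ListMsg   []Msg
)

func (BoolMsg) isMsg()   {}
func (IntMsg) isMsg()    {}
func (StringMsg) isMsg() {}
func (ListMsg) isMsg()   {}

// Encode writes m to w as JSON using a type switch, without reflection.
func Encode(w io.Writer, m Msg) error {
	switch v := m.(type) {
	case BoolMsg:
		_, err := io.WriteString(w, strconv.FormatBool(bool(v)))
		return err
	case IntMsg:
		_, err := io.WriteString(w, strconv.FormatInt(int64(v), 10))
		return err
	case StringMsg:
		return writeString(w, string(v))
	case ListMsg:
		if _, err := io.WriteString(w, "["); err != nil {
			return err
		}
		for i, elem := range v {
			if i > 0 {
				if _, err := io.WriteString(w, ", "); err != nil {
					return err
				}
			}
			if err := Encode(w, elem); err != nil {
				return err
			}
		}
		_, err := io.WriteString(w, "]")
		return err
	default:
		return fmt.Errorf("unsupported message type %T", m)
	}
}

// writeString emits a JSON string literal, escaping quotes, backslashes
// and control characters. Invalid UTF-8 becomes U+FFFD.
func writeString(w io.Writer, s string) error {
	var b strings.Builder
	b.WriteByte('"')
	for _, r := range s {
		switch {
		case r == '"' || r == '\\':
			b.WriteByte('\\')
			b.WriteRune(r)
		case r < 0x20:
			fmt.Fprintf(&b, `\u%04x`, r)
		default:
			b.WriteRune(r)
		}
	}
	b.WriteByte('"')
	_, err := io.WriteString(w, b.String())
	return err
}

// EncodeToString is a convenience wrapper for callers that need a string.
func EncodeToString(m Msg) (string, error) {
	var b strings.Builder
	err := Encode(&b, m)
	return b.String(), err
}

func main() {
	msg := ListMsg{IntMsg(1), BoolMsg(true), StringMsg(`a"b`), ListMsg{}}
	if err := Encode(os.Stdout, msg); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Because spacing is produced during encoding, no strings.ReplaceAll pass is needed afterwards. DictMsg, StructMsg, FloatMsg and UnionMsg are omitted here; each would add one case to the switch.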
Option B: Separate concerns
- String() → fast, deterministic, debug-oriented representation
- MarshalJSON() / ToJSON() → canonical JSON serialization
- Keep JSON for interop, not for every print
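One way the split might look for a single type, as a sketch only: StringMsg here is a hypothetical stand-in, and the choice of Go-style quoting for the debug form is an assumption rather than a proposal from the runtime.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// StringMsg is a hypothetical stand-in for the runtime's string message.
type StringMsg string

// String is the debug path: Go-style quoting, no JSON machinery.
func (m StringMsg) String() string {
	return strconv.Quote(string(m))
}

// MarshalJSON is the canonical interop path with JSON escaping rules.
func (m StringMsg) MarshalJSON() ([]byte, error) {
	return json.Marshal(string(m))
}

func main() {
	m := StringMsg("<b>")

	// fmt picks up String() through the fmt.Stringer interface.
	fmt.Println(m)

	// json.Marshal picks up MarshalJSON() through json.Marshaler; by
	// default it also escapes <, > and & for safe embedding in HTML.
	b, err := json.Marshal(m)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(b))
}
```

The two outputs differ for inputs like this one, which is the point: debug output no longer has to satisfy JSON's escaping rules, and JSON output no longer has to be pleasant to read.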
Option C: Explicit formatting policy
- Define a single formatting contract for:
  - spacing
  - ordering
  - union encoding
- Ensure determinism by sorting keys / fields explicitly
Acceptance criteria / next steps
- Decide whether JSON should remain the default for String()
- Measure current performance (benchmarks on nested messages)
- Identify hot paths (printing, logging, REPL, tests)
- Prototype either:
  - a custom encoder, or
  - a split between debug printing and JSON serialization
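For the measurement step, a starting point could be a small harness like the one below. It compares json.Marshal with a hand-written append-based encoder on a nested list of plain Go values; buildNested and appendJSON are hypothetical helpers, and real numbers would have to come from the runtime's actual message types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"testing"
)

// buildNested returns a list nested to the given depth,
// e.g. depth 3 gives [1, [1, [1]]].
func buildNested(depth int) []any {
	l := []any{1}
	if depth > 1 {
		l = append(l, buildNested(depth-1))
	}
	return l
}

// appendJSON is a hand-written encoder for the ints and lists used here.
// Other types are silently skipped, which is fine for this benchmark only.
func appendJSON(dst []byte, v any) []byte {
	switch x := v.(type) {
	case int:
		return strconv.AppendInt(dst, int64(x), 10)
	case []any:
		dst = append(dst, '[')
		for i, e := range x {
			if i > 0 {
				dst = append(dst, ',')
			}
			dst = appendJSON(dst, e)
		}
		return append(dst, ']')
	}
	return dst
}

func main() {
	v := buildNested(50)

	// testing.Benchmark allows running benchmarks outside of "go test".
	std := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := json.Marshal(v); err != nil {
				b.Fatal(err)
			}
		}
	})
	custom := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		buf := make([]byte, 0, 256)
		for i := 0; i < b.N; i++ {
			buf = appendJSON(buf[:0], v) // reuse the buffer across iterations
		}
	})

	fmt.Println("encoding/json:", std, std.MemString())
	fmt.Println("hand-written: ", custom, custom.MemString())
}
```

In the runtime's own repository the same two loops would more naturally live in a _test.go file as BenchmarkXxx functions, so that results can be tracked across commits.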
References
Relevant runtime code:
- Msg interface
- ListMsg.MarshalJSON
- DictMsg.MarshalJSON
- StructMsg.MarshalJSON
- UnionMsg.String
Notes
This issue is not about changing language semantics.
It is about making formatting explicit, deterministic, and efficient, while preserving the core design principle: all runtime data is serializable.