Open
Proposal: Edge Label Interning and Hybrid Edge Index
This issue proposes a set of functional changes to improve performance and memory efficiency in sodg.rs without altering the external API.
- 1. Edge Label Interning
Introduce a new module labels.rs with a simple label interner.
Types
- type LabelId = u32
- LabelInterner with methods:
  - fn get_or_intern(&mut self, s: &str) -> LabelId
  - fn get(&self, s: &str) -> Option<LabelId>
  - fn resolve(&self, id: LabelId) -> Option<&str>
Details
- Backed by two HashMaps: String -> LabelId and LabelId -> String.
- Start IDs from 1. Reserve 0 for "not found".
- No thread-safety required; interner belongs to Sodg.
- All edges should store LabelId instead of String.
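A minimal sketch of how such an interner could look, using only the standard library. The field names `ids` and `names` and the `new()` constructor are assumptions made for illustration; only the three method signatures come from the list above.

```rust
use std::collections::HashMap;

/// Compact identifier for an interned edge label.
pub type LabelId = u32;

/// Two-way mapping between label strings and their IDs.
/// Field names are hypothetical; the proposal only fixes the method signatures.
#[derive(Default)]
pub struct LabelInterner {
    ids: HashMap<String, LabelId>,
    names: HashMap<LabelId, String>,
}

impl LabelInterner {
    /// Hypothetical constructor, not named in the proposal.
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the ID of `s`, assigning the next free ID if `s` is new.
    pub fn get_or_intern(&mut self, s: &str) -> LabelId {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        // IDs start at 1; 0 stays reserved for "not found".
        let id = self.ids.len() as LabelId + 1;
        self.ids.insert(s.to_owned(), id);
        self.names.insert(id, s.to_owned());
        id
    }

    /// Looks up an existing label without interning it.
    pub fn get(&self, s: &str) -> Option<LabelId> {
        self.ids.get(s).copied()
    }

    /// Maps an ID back to its label text.
    pub fn resolve(&self, id: LabelId) -> Option<&str> {
        self.names.get(&id).map(String::as_str)
    }
}
```

Because labels are never removed in this sketch, the next ID can be derived from the map size. An interner that supported removal would need a separate counter.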
- 2. Hybrid Edge Index per Vertex
Introduce a new module edge_index.rs.
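    use micromap::Map as MicroMap;
    use crate::labels::LabelId;

    pub enum EdgeIndex {
        // micromap::Map takes its capacity as a const generic parameter.
        Small(MicroMap<LabelId, u32, SMALL_THRESHOLD>),
        Large(std::collections::HashMap<LabelId, u32>),
    }

Constant
- SMALL_THRESHOLD: usize = 32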
Methods
- new() -> Self (starts as Small)
- len(&self) -> usize
- get(&self, label: LabelId) -> Option<u32>
- insert(&mut self, label: LabelId, to: u32)
  - Use Small until length exceeds threshold, then migrate to Large.
- remove(&mut self, label: LabelId) -> Option<u32>
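The methods above could be sketched roughly as follows. To keep the example self-contained, a linear-scan `Vec<(LabelId, u32)>` stands in for `micromap::Map` in the `Small` variant; that substitution is an assumption of this sketch, not part of the proposal. The migration rule follows the text: `Small` holds up to `SMALL_THRESHOLD` entries, and the insert that would exceed it moves everything into `Large`.

```rust
use std::collections::HashMap;

pub type LabelId = u32;

/// Largest number of entries the `Small` variant may hold.
pub const SMALL_THRESHOLD: usize = 32;

pub enum EdgeIndex {
    /// Stand-in for `micromap::Map`: a linear-scan vector of pairs.
    Small(Vec<(LabelId, u32)>),
    Large(HashMap<LabelId, u32>),
}

impl EdgeIndex {
    pub fn new() -> Self {
        EdgeIndex::Small(Vec::new())
    }

    pub fn len(&self) -> usize {
        match self {
            EdgeIndex::Small(v) => v.len(),
            EdgeIndex::Large(m) => m.len(),
        }
    }

    pub fn get(&self, label: LabelId) -> Option<u32> {
        match self {
            EdgeIndex::Small(v) => v.iter().find(|&&(l, _)| l == label).map(|&(_, to)| to),
            EdgeIndex::Large(m) => m.get(&label).copied(),
        }
    }

    pub fn insert(&mut self, label: LabelId, to: u32) {
        match self {
            EdgeIndex::Small(v) => {
                let pos = v.iter().position(|&(l, _)| l == label);
                if let Some(i) = pos {
                    // Label already indexed: overwrite its target.
                    v[i].1 = to;
                } else if v.len() < SMALL_THRESHOLD {
                    v.push((label, to));
                } else {
                    // The new entry would exceed the threshold: migrate.
                    let mut m: HashMap<LabelId, u32> = v.drain(..).collect();
                    m.insert(label, to);
                    *self = EdgeIndex::Large(m);
                }
            }
            EdgeIndex::Large(m) => {
                m.insert(label, to);
            }
        }
    }

    pub fn remove(&mut self, label: LabelId) -> Option<u32> {
        match self {
            EdgeIndex::Small(v) => {
                let i = v.iter().position(|&(l, _)| l == label)?;
                Some(v.swap_remove(i).1)
            }
            EdgeIndex::Large(m) => m.remove(&label),
        }
    }
}
```

This sketch never migrates back from `Large` to `Small` after removals. The proposal does not say whether it should, so that choice is left open here.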
Integration into Vertex
- Add field index: EdgeIndex.
- Keep Vec<Edge> for iteration/serialization.
- Update Edge to store label: LabelId instead of String.
- Ensure edges and index stay in sync.
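One way to keep the two structures in sync is to route every mutation through methods on the vertex, as in the sketch below. A plain `HashMap` stands in for `EdgeIndex`, and the method names `bind`, `unbind`, `kid`, and `rebuild_index` on `Vertex` are hypothetical, chosen only for illustration.

```rust
use std::collections::HashMap;

pub type LabelId = u32;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Edge {
    pub label: LabelId,
    pub to: u32,
}

#[derive(Default)]
pub struct Vertex {
    /// Ordered edge list, kept for iteration and serialization.
    pub edges: Vec<Edge>,
    /// Lookup index; a plain HashMap stands in for EdgeIndex here.
    pub index: HashMap<LabelId, u32>,
}

impl Vertex {
    /// Adds or replaces the edge `label`, updating both structures.
    pub fn bind(&mut self, label: LabelId, to: u32) {
        if self.index.insert(label, to).is_some() {
            // Label already present: overwrite its entry in the list.
            if let Some(e) = self.edges.iter_mut().find(|e| e.label == label) {
                e.to = to;
            }
        } else {
            self.edges.push(Edge { label, to });
        }
    }

    /// Removes the edge `label` from both structures.
    pub fn unbind(&mut self, label: LabelId) -> Option<u32> {
        let to = self.index.remove(&label)?;
        self.edges.retain(|e| e.label != label);
        Some(to)
    }

    /// Looks up the target of `label` through the index.
    pub fn kid(&self, label: LabelId) -> Option<u32> {
        self.index.get(&label).copied()
    }

    /// Rebuilds the index from the edge list, e.g. after deserialization.
    pub fn rebuild_index(&mut self) {
        self.index = self.edges.iter().map(|e| (e.label, e.to)).collect();
    }
}
```

If only `edges` is serialized, a `rebuild_index` step like this after loading would restore the lookup structure without storing it on disk.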
- 3. Integration into Sodg
- Add labels: LabelInterner field to Sodg.
- On edge creation (bind, etc.), convert &str to LabelId via get_or_intern.
- On lookup (kid, etc.), convert &str to LabelId via get.
- Update locator parsing to operate on &str slices directly (e.g., split('.')), avoiding temporary Vec<String>. Resolve LabelId per segment during traversal.
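The flow described above might look like the following sketch. `Graph` is a hypothetical stand-in for `Sodg`, a `HashMap<String, LabelId>` stands in for `LabelInterner`, and the method signatures are illustrative assumptions rather than the crate's actual API. The locator is walked segment by segment via `split('.')`, so no intermediate `Vec<String>` is allocated.

```rust
use std::collections::HashMap;

pub type LabelId = u32;

/// Hypothetical stand-in for Sodg, reduced to labels and edges.
#[derive(Default)]
pub struct Graph {
    /// Stand-in for LabelInterner (string -> ID direction only).
    labels: HashMap<String, LabelId>,
    /// Vertex ID -> (label ID -> target vertex ID).
    edges: HashMap<u32, HashMap<LabelId, u32>>,
}

impl Graph {
    /// Edge creation: interns the label, then stores the numeric ID.
    pub fn bind(&mut self, from: u32, to: u32, label: &str) {
        let id = match self.labels.get(label) {
            Some(&id) => id,
            None => {
                let id = self.labels.len() as LabelId + 1;
                self.labels.insert(label.to_owned(), id);
                id
            }
        };
        self.edges.entry(from).or_default().insert(id, to);
    }

    /// Lookup: a label that was never interned cannot match any edge,
    /// so the read-only `get` is enough and nothing is allocated.
    pub fn kid(&self, v: u32, label: &str) -> Option<u32> {
        let id = *self.labels.get(label)?;
        self.edges.get(&v)?.get(&id).copied()
    }

    /// Follows a dot-separated locator, resolving one segment at a time.
    pub fn find(&self, start: u32, locator: &str) -> Option<u32> {
        let mut v = start;
        for segment in locator.split('.') {
            v = self.kid(v, segment)?;
        }
        Some(v)
    }
}
```

Using the non-interning lookup on the read path also keeps failed queries from growing the label table.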
- 4. Reducing Data Allocations
- For data types like Hex, replace heavy clones with Arc<[u8]> inside an enum variant.
  - Clones become O(1).
  - Serialization must remain intact.
- In gc:
  - Avoid cloning full Vertex.
  - Collect child_ids: Vec<u32> from an immutable borrow, then mutate graph afterward.
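Both ideas can be illustrated with a small sketch. `Payload` is a hypothetical enum standing in for `Hex`, and `detach` is an invented helper that only demonstrates the two-phase borrow pattern; neither reflects the actual layout of `Hex` or the real `gc` logic. The `parents` counter is likewise an assumption added so that the mutation phase has something to update.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical payload type: bytes shared behind a reference count.
#[derive(Clone, Debug, PartialEq)]
pub enum Payload {
    Empty,
    Bytes(Arc<[u8]>),
}

impl Payload {
    /// Copies the bytes once, on construction.
    pub fn from_slice(bytes: &[u8]) -> Self {
        if bytes.is_empty() {
            Payload::Empty
        } else {
            Payload::Bytes(Arc::from(bytes))
        }
    }

    pub fn as_bytes(&self) -> &[u8] {
        match self {
            Payload::Empty => &[],
            Payload::Bytes(b) => &b[..],
        }
    }
}

pub struct Vertex {
    pub data: Payload,
    pub kids: Vec<u32>,
    /// Hypothetical counter of incoming edges.
    pub parents: usize,
}

/// Removes vertex `id` and decrements the parent count of each child.
pub fn detach(graph: &mut HashMap<u32, Vertex>, id: u32) {
    // Phase 1: immutable borrow; copy out only the child IDs,
    // not the whole vertex.
    let child_ids: Vec<u32> = match graph.get(&id) {
        Some(v) => v.kids.clone(),
        None => return,
    };
    // Phase 2: the immutable borrow has ended, so mutation is allowed.
    for kid in child_ids {
        if let Some(k) = graph.get_mut(&kid) {
            k.parents = k.parents.saturating_sub(1);
        }
    }
    graph.remove(&id);
}
```

Cloning a `Payload::Bytes` value only bumps the reference count, so the cost no longer depends on the payload size. The bytes themselves are still reachable through `as_bytes`, which is what a serializer would read.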
- 5. Tests
- Update tests: internal edges use LabelId, but external API still accepts &str.
- Add new tests:
  - kid() lookup on vertices with degree 1, 31, and 33 (validate Small→Large transition).
  - Edge removal in both Small and Large variants.
  - find() with multi-segment paths.
  - Serialization/deserialization with LabelInterner (ensure indices rebuild correctly).
Expected Outcomes
- Faster edge lookups on small-degree vertices via micromap.
- Reduced memory overhead from interning labels.
- O(1) clones for vertex data payloads.
- No breaking changes to the public API.