A learning-oriented mini distributed file system written in Rust, focused on chunking, hashing, compression, replication, and parity recovery, loosely inspired by the design ideas of the Google File System (GFS).
Inspirational parallels to GFS:
- Large immutable chunks minimise master interactions and make sequential I/O efficient.
- Master + chunkserver mindset (here: coordinator + nodes) keeps metadata centralised but datapath peer-to-peer.
- Replication & cheap hardware: tolerate node loss by re-replicating or rebuilding from parity.
Under active development (pre-v1)
- CLI scaffold via `clap` (`put`, `get`, `node`, `coord`)
- Chunker: fixed-size splitter + SHA-256 hash table + unit tests
- In-memory catalog mapping `file → chunks → nodes`
- Configurable replication factor (round-robin placement)
- LZ4 compression on each chunk
- `put` operation
- XOR parity per block-set; offline node recovery
- Download path with checksum verification, decompression & reassembly
- Benchmarks with `hyperfine` + flamegraphs
- Parallel transfer path (threads or Tokio)
- Error handling of edge cases
- Usage examples & retrospective in README
- Stretch: gRPC streaming, consistent-hash ring, Reed–Solomon erasure coding
```mermaid
flowchart LR
    %% Client side
    subgraph CLIENT["Client Machine"]
        APP[Application]
        CLI[dfs-sim client / CLI]
        APP --> CLI
    end

    %% Coordinator
    COORD["Coordinator / Master<br/>(in-memory catalog + replication policy)"]

    %% Storage box (no title here)
    subgraph STORAGE
        direction TB
        N1[Node-1]
        N2[Node-2]
        N3[Node-3]
        MORE((...))
        %% External storage label as its own node
        L["Chunk Servers"]
        style L fill:none,stroke:none
    end

    %% Control (dashed)
    CLI -. lookup(file, chunk-idx) .-> COORD
    COORD -. locations + chunk handles .-> CLI
    N1 -. heartbeat + chunk state .-> COORD
    N2 -. heartbeat + chunk state .-> COORD
    N3 -. heartbeat + chunk state .-> COORD

    %% Data (solid)
    CLI -->|stream chunk| N1
    CLI -->|replica / parity| N2
    CLI -->|replica / parity| N3
```
Solid arrows: data path.
Dashed arrows: metadata / control.
```bash
# three nodes
dfs-sim node --id n1 --listen 127.0.0.1:4001
dfs-sim node --id n2 --listen 127.0.0.1:4002
dfs-sim node --id n3 --listen 127.0.0.1:4003

# coordinator
dfs-sim coord --nodes 127.0.0.1:4001,127.0.0.1:4002,127.0.0.1:4003 \
              --replication 2 --parity xor

# store & retrieve
dfs-sim put ./large.iso --name iso/ubuntu.iso
dfs-sim get iso/ubuntu.iso --out ./restore.iso
```
Bring one node down (Ctrl-C `n1`) and repeat `dfs-sim get …` to watch parity recovery in the logs.
- PUT: CLI streams the file ➜ Coordinator.
- The Coordinator splits the file into fixed-size 64 MB chunks, computes a SHA-256 hash per chunk, compresses each one, picks the replica + parity layout, then streams each chunk to the selected nodes in parallel (round-robin placement for now).
- GET: the Coordinator returns the ordered chunk list; the CLI concurrently pulls chunks, verifies hashes, decompresses, and reassembles (see the sketch after this list). Missing chunks trigger a parity rebuild on the fly.
- Node failure simulation: bring a node down; the catalog notices the missed heartbeats, marks the node offline, and the read path fetches from replicas or kicks off XOR reconstruction.
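A minimal sketch of the GET-side verify-and-reassemble loop; `ChunkRef`, `fetch_and_decompress`, and the catalog shape are assumptions standing in for the real network and LZ4 code paths:

```rust
use sha2::{Digest, Sha256};
use std::fs::File;
use std::io::Write;

/// One catalog entry as the GET path might see it (hypothetical shape).
struct ChunkRef {
    index: u64,
    sha256: [u8; 32],
    locations: Vec<String>, // node addresses holding a replica
}

/// Reassemble a file from its ordered chunk list, verifying each chunk.
/// `fetch_and_decompress` is a stand-in for the real network + LZ4 path.
fn reassemble(
    chunks: &[ChunkRef],
    fetch_and_decompress: impl Fn(&ChunkRef) -> std::io::Result<Vec<u8>>,
    out_path: &str,
) -> std::io::Result<()> {
    let mut out = File::create(out_path)?;
    for chunk in chunks {
        let bytes = fetch_and_decompress(chunk)?;
        // Integrity check: recompute SHA-256 and compare with the catalog hash.
        let digest = Sha256::digest(&bytes);
        if digest.as_slice() != chunk.sha256.as_slice() {
            // A corrupt or missing replica would trigger the parity rebuild path here.
            return Err(std::io::Error::new(
                std::io::ErrorKind::InvalidData,
                format!("checksum mismatch on chunk {}", chunk.index),
            ));
        }
        out.write_all(&bytes)?;
    }
    Ok(())
}
```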
- Each chunk is 64 MB: large enough to amortize metadata overhead, small enough for a demo.
- Once written, chunks are append-only, mirroring GFS's design for simpler consistency guarantees.
- The chunker reads the file in 64 MB blocks and computes a SHA-256 hash per block; the hashes are stored in a custom data structure (see the sketch below).
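A minimal sketch of that fixed-size split + hash step using the `sha2` crate; `chunk_file` and the returned record shape are illustrative, not the project's actual API:

```rust
use sha2::{Digest, Sha256};
use std::fs::File;
use std::io::Read;

const CHUNK_SIZE: usize = 64 * 1024 * 1024; // 64 MB

/// (chunk index, SHA-256 digest, chunk length in bytes)
type ChunkRecord = (u64, [u8; 32], usize);

/// Read a file in fixed 64 MB blocks and hash each block.
fn chunk_file(path: &str) -> std::io::Result<Vec<ChunkRecord>> {
    let mut file = File::open(path)?;
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut records = Vec::new();
    let mut index = 0u64;
    loop {
        // Fill the buffer as far as possible; the last chunk may be short.
        let mut filled = 0;
        while filled < CHUNK_SIZE {
            let n = file.read(&mut buf[filled..])?;
            if n == 0 {
                break;
            }
            filled += n;
        }
        if filled == 0 {
            break; // end of file
        }
        let mut digest = [0u8; 32];
        digest.copy_from_slice(Sha256::digest(&buf[..filled]).as_slice());
        records.push((index, digest, filled));
        index += 1;
    }
    Ok(records)
}
```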
- Each chunk is streamed to R distinct nodes (`--replication R`).
- An optional XOR parity block is written to a further node set (`--parity xor`), allowing recovery from a single-node loss (see the sketch after this list).
- The coordinator tracks live nodes; missing replicas are lazily rebuilt in the background.
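A sketch of the single-loss recovery idea (equal-length chunks assumed, padding implied): the parity block is the byte-wise XOR of all chunks in a set, so any one missing chunk can be rebuilt by XOR-ing the parity with the survivors. Function names here are illustrative:

```rust
/// Byte-wise XOR parity over a set of equal-length chunks.
fn xor_parity(chunks: &[Vec<u8>]) -> Vec<u8> {
    let len = chunks.iter().map(|c| c.len()).max().unwrap_or(0);
    let mut parity = vec![0u8; len];
    for chunk in chunks {
        for (p, b) in parity.iter_mut().zip(chunk) {
            *p ^= b;
        }
    }
    parity
}

/// Rebuild one missing chunk from the parity block and the surviving chunks.
fn rebuild_missing(parity: &[u8], survivors: &[&[u8]]) -> Vec<u8> {
    let mut rebuilt = parity.to_vec();
    for chunk in survivors {
        for (r, b) in rebuilt.iter_mut().zip(*chunk) {
            *r ^= b;
        }
    }
    rebuilt
}

fn main() {
    let a = b"chunk-a".to_vec();
    let b = b"chunk-b".to_vec();
    let c = b"chunk-c".to_vec();
    let parity = xor_parity(&[a.clone(), b.clone(), c.clone()]);
    // Pretend the node holding `b` is offline: rebuild it from parity + survivors.
    let recovered = rebuild_missing(&parity, &[&a, &c]);
    assert_eq!(recovered, b);
}
```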
- Namespace & placement: chooses nodes, maintains in-memory catalog (serialised to disk on exit).
- Heart-beat & health: nodes ping every 3 s; missed pings mark node offline.
- Recovery: scans the catalog and schedules background `parity::rebuild()` tasks (a health-sweep sketch follows this list).
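A minimal sketch of how that heartbeat and recovery bookkeeping could look; `NodeHealth`, `Coordinator::sweep`, and the 9 s offline threshold are illustrative assumptions, not the actual implementation:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Assumed policy: a node that misses three 3 s heartbeats is considered offline.
const OFFLINE_AFTER: Duration = Duration::from_secs(9);

#[derive(Debug, PartialEq)]
enum NodeState {
    Online,
    Offline,
}

struct NodeHealth {
    last_heartbeat: Instant,
    state: NodeState,
}

struct Coordinator {
    nodes: HashMap<String, NodeHealth>,
}

impl Coordinator {
    /// Record an incoming heartbeat from a node.
    fn on_heartbeat(&mut self, node_id: &str) {
        let entry = self
            .nodes
            .entry(node_id.to_string())
            .or_insert_with(|| NodeHealth {
                last_heartbeat: Instant::now(),
                state: NodeState::Online,
            });
        entry.last_heartbeat = Instant::now();
        entry.state = NodeState::Online;
    }

    /// Periodic sweep: mark silent nodes offline and return them so the
    /// caller can schedule background rebuild tasks for their chunks.
    fn sweep(&mut self) -> Vec<String> {
        let now = Instant::now();
        let mut newly_offline = Vec::new();
        for (id, health) in self.nodes.iter_mut() {
            if health.state == NodeState::Online
                && now.duration_since(health.last_heartbeat) > OFFLINE_AFTER
            {
                health.state = NodeState::Offline;
                newly_offline.push(id.clone());
            }
        }
        newly_offline
    }
}
```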
- Chunk bytes → `lz4_flex::compress`
- Prepend 4-byte original-size header
- Store the compressed buffer; on read, reverse the steps (sketched below).
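A minimal sketch of that wrapper with `lz4_flex`'s block API; the 4-byte little-endian header layout is an assumption based on the description above:

```rust
use lz4_flex::{compress, decompress};

/// Compress a chunk and prepend the original length as a 4-byte LE header.
fn compress_chunk(raw: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + raw.len());
    out.extend_from_slice(&(raw.len() as u32).to_le_bytes());
    out.extend_from_slice(&compress(raw));
    out
}

/// Reverse the wrapper: read the header, then decompress to exactly that size.
fn decompress_chunk(stored: &[u8]) -> Result<Vec<u8>, lz4_flex::block::DecompressError> {
    let (header, body) = stored.split_at(4);
    let original_len = u32::from_le_bytes(header.try_into().unwrap()) as usize;
    decompress(body, original_len)
}
```

(`lz4_flex` also ships `compress_prepend_size` / `decompress_size_prepended`, which handle a size header for you; the manual header above just makes the on-disk layout explicit.)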
- Tokio runtime (or `std::thread` + channels) drives async transfers.
- Chunk uploads are pipelined; hashing & compression run in a scoped thread pool so CPU work overlaps I/O.
- Shared state lives behind `Arc<Mutex<…>>` to exercise Rust's ownership model (see the sketch after this list).
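A minimal sketch of the shared-state pattern with `std::thread::scope` and the `sha2` crate (the Tokio variant would use tasks and an async mutex instead); the catalog shape here is illustrative:

```rust
use sha2::{Digest, Sha256};
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Catalog: chunk index -> SHA-256 digest, shared across worker threads.
    let catalog: Arc<Mutex<HashMap<u64, [u8; 32]>>> = Arc::new(Mutex::new(HashMap::new()));

    let chunks: Vec<Vec<u8>> = vec![vec![1u8; 1024], vec![2u8; 1024], vec![3u8; 1024]];

    // Scoped threads let workers borrow `chunks` without moving ownership.
    thread::scope(|s| {
        for (idx, chunk) in chunks.iter().enumerate() {
            let catalog = Arc::clone(&catalog);
            s.spawn(move || {
                // CPU-bound work (hashing, compression) runs off the main thread,
                // overlapping with I/O done elsewhere.
                let mut digest = [0u8; 32];
                digest.copy_from_slice(Sha256::digest(chunk).as_slice());
                catalog.lock().unwrap().insert(idx as u64, digest);
            });
        }
    });

    println!("hashed {} chunks", catalog.lock().unwrap().len());
}
```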
Planned with `hyperfine` (throughput) and `cargo-flamegraph` (hot spots). The report will include:
- PUT/GET MB/s vs. replication factor
- Compression ratio vs. CPU cost
- Restore latency under node loss
- Ghemawat S., Gobioff H., Leung S-T. The Google File System. SOSP 2003.
- Project outline: dfs-simulator-project (pdf).
Contributions will be welcome once the core scaffolding stabilises. Feel free to open issues for design discussion; pull requests will be reviewed after an initial public alpha.
Released under the MIT License – see `LICENSE` for details.