Contributor Guideline & Design Doc Collections [Current Status Quo & Future Plan] #11

@ashione

Main Modules


Reference & Design Doc

Modules of Ray Streaming:

  • CrossLang API
  • Runtime
  • Coordinator/Scheduler
  • Reliability/Fault Tolerance
  • Fusion Training
  • Advanced Functions

✅ : Finished
⚠️ : Being moved from internal / in development
🏃🏻‍♀️: Contributions welcome (if you want to contribute a feature, please let us know)

CrossLang API

X-Lang:

  • Java <-> Python ✅
  • Arrow ⚠️

API:

  • Source ✅
  • Union ✅
  • Sink ✅
  • Map ✅
  • FlatMap ✅
  • Reduce ✅
  • Join ⚠️
  • Window ⚠️

Popular Connectors:

  • Kafka 🏃🏻‍♀️
  • MySQL 🏃🏻‍♀️
  • etc.

Runtime

Transfer:

  • Low Latency RingBuffer Auto Flush ✅
  • RandomShuffle ✅
  • HashShuffle ✅
  • DynamicRebalance ⚠️
  • Colocated SharedMemory ⚠️
  • FlowControl ⚠️ (Without Empty Message): 04 Feb Empty Message
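
To illustrate the difference between the RandomShuffle and HashShuffle entries above, here is a minimal sketch of the two routing strategies. It is not Ray Streaming's implementation: the function names `hash_partition` and `random_partition`, the channel count, and the choice of CRC32 as the hash are all assumptions made for illustration.

```python
import random
import zlib


def hash_partition(key: str, num_channels: int) -> int:
    """Pick a channel from a stable hash of the key (hypothetical helper)."""
    # zlib.crc32 gives the same value in every process, unlike the built-in
    # hash() for str, which is salted per interpreter run.
    return zlib.crc32(key.encode("utf-8")) % num_channels


def random_partition(num_channels: int, rng: random.Random) -> int:
    """Pick a uniformly random channel, ignoring the key (hypothetical helper)."""
    return rng.randrange(num_channels)


# Example records as (key, value) pairs; the data is made up.
records = [("user-1", 3), ("user-2", 5), ("user-1", 7)]
num_channels = 4

hash_routes = [hash_partition(key, num_channels) for key, _ in records]

rng = random.Random(0)
random_routes = [random_partition(num_channels, rng) for _ in records]
```

With hash routing, both `user-1` records land on the same channel, which keyed operators such as Reduce rely on. Random routing only spreads load and gives no such guarantee.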

State Backend:

  • State backend common library ⚠️
  • Memory Backend ✅
  • RocksDB Backend ⚠️
  • S3 Backend 🏃🏻‍♀️

Buffer Optimization:

  • BufferPool ⚠️ Part1 28 Mar
  • ElasticBuffer ⚠️

Coordinator/Scheduler

Coordinator:

  • Rewrite JobMaster in Python 🏃🏻‍♀️

Scheduler & AutoScale:

  • PlacementGroup-Pipeline First ⚠️
  • Pod-Wise-Random-Scheduler ✅
  • Random Scheduler ⚠️
  • United Distributed Controller ⚠️🏃🏻‍♀️
  • Rescale Adaptation ⚠️🏃🏻‍♀️

Reliability/Fault Tolerance

Reliability:

  • Checkpoint Sync ✅
  • Checkpoint Async ⚠️🏃🏻‍♀️
  • At Least Once ✅
  • Exactly Once ⚠️
  • Exactly Same ⚠️
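
As a rough illustration of how the At Least Once and Exactly Once entries differ, the sketch below simulates a crash followed by a replay from the last checkpoint, then removes duplicates by sequence id. It shows the general idea only, not how Ray Streaming implements either guarantee. The helper names, the checkpoint interval, and the record layout are assumptions.

```python
from typing import List, Set, Tuple

Record = Tuple[int, str]  # (sequence id, payload); layout is an assumption


def deliver_at_least_once(
    records: List[Record], checkpoint_every: int, crash_before: int
) -> List[Record]:
    """Deliver records, crash once, then replay from the last checkpoint."""
    delivered: List[Record] = []
    last_checkpoint = 0
    for i, record in enumerate(records):
        if i == crash_before:
            break  # simulated failure just before record i is delivered
        delivered.append(record)
        if (i + 1) % checkpoint_every == 0:
            last_checkpoint = i + 1  # records before this index are covered
    # Recovery: replay everything after the last checkpoint, which may
    # re-deliver records that were already sent before the crash.
    delivered.extend(records[last_checkpoint:])
    return delivered


def dedupe(delivered: List[Record]) -> List[Record]:
    """Drop repeated sequence ids so each record takes effect once."""
    seen: Set[int] = set()
    unique: List[Record] = []
    for seq, payload in delivered:
        if seq not in seen:
            seen.add(seq)
            unique.append((seq, payload))
    return unique


records = [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (4, "e")]
delivered = deliver_at_least_once(records, checkpoint_every=2, crash_before=3)
exactly_once_view = dedupe(delivered)
```

In this run the record with sequence id 2 is delivered twice, because it was sent after the last checkpoint and before the crash. Nothing is lost, which is the at-least-once guarantee, and the deduplicated view matches the original input.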

Fusion Training

Training:

  • CircularBuffer shared with TensorFlow/PyTorch reader ⚠️
  • Parameter Server Scheduler ⚠️
  • Evaluator Scheduler ⚠️
  • Parameter Server AutoPartition ⚠️

Advanced Functions

  • Distributed RPC over DAG ⚠️
  • Metrics & Profiling ⚠️
