Reinforcement Learning RFC #69
base: alpha
Conversation
This RFC proposes how Hypha can be improved to support Reinforcement Learning.
> The second requirement will be satisfied by improving the Scheduler to redirect Worker requests for data to different RL Data Nodes. Thus, the Scheduler needs to balance latency between RL Data Nodes and Workers as well as sampling and processing speed.
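A minimal sketch of what such a latency-aware selection inside the Scheduler could look like; the `DataNodeStats` type, the `pick_data_node` helper, and the weighting scheme are illustrative assumptions, not existing Hypha APIs:

```python
from dataclasses import dataclass

@dataclass
class DataNodeStats:
    node_id: str
    latency_ms: float        # measured round-trip time from the requesting Worker
    samples_per_sec: float   # current sampling/processing throughput of the node

def pick_data_node(candidates: list[DataNodeStats], latency_weight: float = 0.5) -> str:
    """Pick the RL Data Node with the best trade-off between latency and throughput.

    Lower latency and higher throughput both reduce the score; latency_weight
    controls how much connection latency matters relative to sampling speed.
    """
    def score(stats: DataNodeStats) -> float:
        # Convert throughput into a cost so both terms are minimized together.
        throughput_cost = 1.0 / max(stats.samples_per_sec, 1e-6)
        return latency_weight * stats.latency_ms + (1 - latency_weight) * throughput_cost

    return min(candidates, key=score).node_id
```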
I wonder whether we should model this not in the scheduler but via the connector/bridge, much like the stochastic wiring described in the SWARM learning paper. We already have many-references and different selection strategies that let us point from one worker to many data nodes, so we would only need to extend this with a strategy that considers connection and delivery speed (latency, bandwidth, generation) to optimally connect workers with data nodes.
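As a rough sketch of how that could live on the connector/bridge side rather than in the Scheduler, assuming a hypothetical selection-strategy interface and per-link metrics that are not part of Hypha today:

```python
import random

class LinkAwareSelection:
    """Hypothetical selection strategy for a many-reference from a Worker to Data Nodes.

    Instead of the Scheduler deciding, the reference weights its targets by
    connection quality (latency, bandwidth) and samples stochastically, similar
    in spirit to the stochastic wiring described in the SWARM learning paper.
    """

    def __init__(self, link_metrics: dict[str, dict[str, float]]):
        # link_metrics: node_id -> {"latency_ms": ..., "bandwidth_mbps": ...}
        self.link_metrics = link_metrics

    def select(self) -> str:
        nodes = list(self.link_metrics)
        # Higher bandwidth and lower latency yield a higher selection weight.
        weights = [
            m["bandwidth_mbps"] / max(m["latency_ms"], 1e-6)
            for m in self.link_metrics.values()
        ]
        return random.choices(nodes, weights=weights, k=1)[0]
```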
I think this would be possible. However, I would like to avoid mixing concepts. We decided to go with DiLoCo and a centralized scheduler. If we now start to loosen this by introducing a form of decentralized scheduling, it will complicate things more than it will help.
Well, it would be super interesting to benchmark one approach against the other, no matter which one we adopt as the standard moving forward.
I fully agree, but I would rather have a working baseline and improve from there.
Yeah, let's start with the scheduler approach then.
We should probably do #80 first.