feat(ocap-kernel): Support multiple subclusters #530

sirtimid · 2025-06-02T10:49:29Z

Currently the kernel only supports a single subcluster at a time.
This PR lets users create, retrieve, update, and delete multiple subclusters, each with its own configuration. It also updates the control panel so that these subclusters can be viewed and managed through the interface.

Screen.Recording.2025-06-04.at.12.34.47.mov

grypez

Design LGTM. I summarize the key decisions below.

enforced one-to-many subcluster-to-vat relation
reloadSubcluster reloads all of the configured vats, even if they were terminated
special rogue vat status for vats not in subclusters
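To make the first and third decisions concrete, here is a minimal sketch of how a one-to-many relation and a "rogue" check could be modeled in memory. The `SubclusterIndex` class and all of its method names are hypothetical and chosen for illustration; they are not the store methods added in this PR.

```typescript
// Hypothetical in-memory index; class and method names are illustrative only.
type VatId = string;
type SubclusterId = string;

class SubclusterIndex {
  private readonly vatsBySubcluster = new Map<SubclusterId, Set<VatId>>();

  private readonly subclusterByVat = new Map<VatId, SubclusterId>();

  private readonly knownVats = new Set<VatId>();

  addSubcluster(subclusterId: SubclusterId): void {
    if (this.vatsBySubcluster.has(subclusterId)) {
      throw new Error(`Subcluster ${subclusterId} already exists`);
    }
    this.vatsBySubcluster.set(subclusterId, new Set<VatId>());
  }

  // Record a vat that exists in the kernel, whether or not it has a subcluster.
  registerVat(vatId: VatId): void {
    this.knownVats.add(vatId);
  }

  // One-to-many: a subcluster may hold many vats, but a vat belongs to at
  // most one subcluster, so moving a vat to a different subcluster is rejected.
  addVatToSubcluster(subclusterId: SubclusterId, vatId: VatId): void {
    const vats = this.vatsBySubcluster.get(subclusterId);
    if (vats === undefined) {
      throw new Error(`Subcluster ${subclusterId} does not exist`);
    }
    const current = this.subclusterByVat.get(vatId);
    if (current !== undefined && current !== subclusterId) {
      throw new Error(`Vat ${vatId} already belongs to subcluster ${current}`);
    }
    this.knownVats.add(vatId);
    vats.add(vatId);
    this.subclusterByVat.set(vatId, subclusterId);
  }

  getVats(subclusterId: SubclusterId): VatId[] {
    const vats = this.vatsBySubcluster.get(subclusterId);
    return vats === undefined ? [] : Array.from(vats);
  }

  // A "rogue" vat is a known vat that is not a member of any subcluster.
  isRogue(vatId: VatId): boolean {
    return this.knownVats.has(vatId) && !this.subclusterByVat.has(vatId);
  }
}
```

Keeping a reverse map from vat to subcluster makes both the membership check and the rogue check a single lookup.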

Code looks good overall. I make some small recommendations.

packages/extension/src/ui/components/LaunchVat.test.tsx

packages/ocap-kernel/src/store/methods/subclusters.test.ts

packages/ocap-kernel/src/store/methods/subclusters.ts

packages/ocap-kernel/src/store/methods/subclusters.test.ts

sirtimid · 2025-06-03T17:39:54Z

Design LGTM. I summarize the key decisions below.

enforced one-to-many subcluster-to-vat relation

reloadSubcluster reloads all of the configured vats, even if they were terminated

special rogue vat status for vats not in subclusters

Code looks good overall. I make some small recommendations.

@grypez I just want to point out that the kernel has two distinct operations for handling vats/subclusters: restart and reload. Restart is a "pause and resume" operation: it stops the vat but preserves its state and identity, then immediately starts it again with the same configuration. Reload, on the other hand, is a "clean slate" operation: it fully terminates the vat/subcluster, then creates a new instance with fresh state and a new ID. Restarting a vat is like putting it to sleep and waking it up, while reloading is like creating a brand new instance from scratch. This is why reloadSubcluster loads all of the configured vats, even if they were terminated. I didn't create a restartSubcluster since we can run restartVat on each vat individually, but it would be easy to implement if needed.
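The difference can be sketched with a toy model. Everything below is an assumption made for illustration: `ToyKernel`, the `bundleSpec` field, the `v1`-style IDs, and the method signatures are not the kernel's actual API. Only the behavior described above is modeled: restart keeps ID and state, while reload starts over from the configuration.

```typescript
// Toy model only; types, IDs, and signatures are hypothetical.
type VatConfig = { bundleSpec: string };
type SubclusterConfig = { vats: Record<string, VatConfig> };
type ToyVat = {
  id: string;
  name: string;
  config: VatConfig;
  state: Record<string, unknown>;
};

class ToyKernel {
  private counter = 0;

  readonly vats = new Map<string, ToyVat>();

  launchVat(name: string, config: VatConfig): ToyVat {
    this.counter += 1;
    const vat: ToyVat = { id: `v${this.counter}`, name, config, state: {} };
    this.vats.set(vat.id, vat);
    return vat;
  }

  terminateVat(id: string): void {
    this.vats.delete(id);
  }

  // Restart: "pause and resume". The vat keeps its ID and its state.
  restartVat(id: string): ToyVat {
    const vat = this.vats.get(id);
    if (vat === undefined) {
      throw new Error(`Unknown vat ${id}`);
    }
    // A real kernel would stop and re-start the vat's worker here.
    return vat;
  }

  // Reload: "clean slate". Terminate whatever is still running, then launch
  // every vat named in the configuration with a new ID and fresh state.
  reloadSubcluster(config: SubclusterConfig, liveVatIds: string[]): ToyVat[] {
    liveVatIds.forEach((id) => this.terminateVat(id));
    return Object.keys(config.vats).map((name) =>
      this.launchVat(name, config.vats[name]),
    );
  }
}

const kernel = new ToyKernel();
const config: SubclusterConfig = {
  vats: {
    alice: { bundleSpec: 'alice.bundle' },
    bob: { bundleSpec: 'bob.bundle' },
  },
};
const alice = kernel.launchVat('alice', config.vats.alice);
const bob = kernel.launchVat('bob', config.vats.bob);
alice.state.counter = 1;

const restarted = kernel.restartVat(alice.id); // same ID, state preserved
kernel.terminateVat(bob.id); // bob is gone before the reload
const reloaded = kernel.reloadSubcluster(config, [alice.id]); // both vats return
```

Because `reloadSubcluster` reads only the configuration, the previously terminated `bob` is launched again alongside `alice`, and neither new vat carries over any state.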

FUDCo · 2025-06-03T22:11:22Z

special rogue vat status for vats not in subclusters

If I'm following this correctly (caveat: not entirely sure I am) I don't see the purpose at all in supporting a vat not in a subcluster. A lone vat is just a subcluster with a single vat in it, which I always expected to be an entirely normal thing. As I suggested above, the main reason for introducing the subcluster abstraction was to allow a vat's configuration to be coordinated with the configurations of other vats (e.g., because they are to be born in some kind of cooperative relationship with each other). Basically, it's another layer of nested braces in the config file. Being able to do things like terminate a subcluster seems like a really useful leveraging of that concept (i.e., these things are together, so it's nice to provide a safe and convenient affordance for doing something to all of them collectively instead of having to futz with them individually), but it's not fundamental.
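As a rough illustration of the "nested braces" point, the sketch below shows a lone vat expressed as a subcluster with one entry, plus a collective operation over a subcluster. The `ClusterConfig` shape, the `bootstrap` and `bundleSpec` fields, and both helper functions are assumptions for the example, not the kernel's confirmed types or API.

```typescript
// Hypothetical config shapes; field names are assumptions for illustration.
type VatConfig = { bundleSpec: string; parameters?: Record<string, unknown> };
type ClusterConfig = { bootstrap: string; vats: Record<string, VatConfig> };

// Several vats configured together so they can be born cooperating.
const cooperative: ClusterConfig = {
  bootstrap: 'alice',
  vats: {
    alice: { bundleSpec: 'alice.bundle' },
    bob: { bundleSpec: 'bob.bundle', parameters: { greeting: 'hello' } },
  },
};

// A lone vat is the same structure with a single entry that is its own bootstrap.
function singleVatSubcluster(name: string, vat: VatConfig): ClusterConfig {
  return { bootstrap: name, vats: { [name]: vat } };
}

// A collective operation: act on every vat in the subcluster in one call
// instead of handling each vat individually.
function terminateSubcluster(
  config: ClusterConfig,
  terminateVat: (name: string) => void,
): string[] {
  const names = Object.keys(config.vats);
  names.forEach((name) => terminateVat(name));
  return names;
}

const lone = singleVatSubcluster('solo', { bundleSpec: 'solo.bundle' });
const terminated: string[] = [];
terminateSubcluster(cooperative, (name) => terminated.push(name));
terminateSubcluster(lone, (name) => terminated.push(name));
```

With this shape, code that handles subclusters needs no special case for a standalone vat.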

grypez · 2025-06-03T22:41:09Z

@grypez I just want to point out that the kernel has two distinct operations for handling vats/subclusters: ... Restart is a "pause and resume" operation: it stops the vat but preserves its state and identity, then immediately starts it again with the same configuration. Reload, on the other hand, is a "clean slate" operation: it fully terminates the vat/subcluster, then creates a new instance with fresh state and a new ID. ... I didn't create a restartSubcluster since we can run restartVat on each vat individually, but it would be easy to implement if needed.

That makes sense, and explains why reloadSubcluster depends only on the configuration, not the state.

Addressed

grypez · 2025-06-03T23:21:12Z

If I'm following this correctly (caveat: not entirely sure I am) I don't see the purpose at all in supporting a vat not in a subcluster. A lone vat is just a subcluster with a single vat in it, which I always expected to be an entirely normal thing. ...

I support single vat subclusters pulling their own bootstraps. This is already possible.

grypez

LGTM

sirtimid · 2025-06-04T16:19:26Z

special rogue vat status for vats not in subclusters

If I'm following this correctly (caveat: not entirely sure I am) I don't see the purpose at all in supporting a vat not in a subcluster. A lone vat is just a subcluster with a single vat in it, which I always expected to be an entirely normal thing. As I suggested above, the main reason for introducing the subcluster abstraction was to allow a vat's configuration to be coordinated with the configurations of other vats (e.g., because they are to be born in some kind of cooperative relationship with each other). Basically, it's another layer of nested braces in the config file. Being able to do things like terminate a subcluster seems like a really useful leveraging of that concept (i.e., these things are together, so it's nice to provide a safe and convenient affordance for doing something to all of them collectively instead of having to futz with them individually), but it's not fundamental.

You're absolutely right, treating a lone vat as simply a subcluster with one vat makes a lot of sense, and I agree it would simplify the model overall. What you're suggesting would mean removing support for launching vats outside of a subcluster entirely (i.e., removing kernel.launchVat), which I think is the right direction long-term.

That said, since my current task was scoped specifically to adding support for multiple subclusters, I’ve kept the existing behavior for now to avoid widening the scope too much, especially given that a lot of our integration tests in @ocap/kernel-test rely on launching standalone vats.

To make sure we follow up on this properly, I’ve created a new task to migrate the kernel to fully enforce subcluster-based vat launching. I'll pick that up next.

FUDCo

LGTM!

sirtimid requested a review from a team as a code owner June 2, 2025 10:49

grypez previously requested changes Jun 3, 2025

rekmarks linked an issue Jun 3, 2025 that may be closed by this pull request

Extend kernel to support multiple subclusters #401

Open

sirtimid force-pushed the sirtimid/subclusters branch from 59eabea to af76f27 Compare June 3, 2025 20:41

sirtimid requested review from grypez and FUDCo June 3, 2025 20:42

grypez previously approved these changes Jun 4, 2025

sirtimid added 11 commits June 4, 2025 20:46

support multiple subclusters

164989e

Add subcluster store methods

819b351

Implement multiple subclusters in the UI

839bb9f

Fix reloading

c7c667d

Simplify API

aae910f

fix scrolling and ui styles

3d546f8

Fix jsdocs

d5cb8a4

increase timeout

f94b359

Fix e2e tests and add rogue vats table

4bcc624

Disallow vat transfer to subcluster and apply comment suggestions

0504f30

Show subcluster config in a modal and extract reused components

bca937f

sirtimid dismissed grypez’s stale review via bca937f June 4, 2025 18:51

sirtimid force-pushed the sirtimid/subclusters branch from c497573 to bca937f Compare June 4, 2025 18:51

FUDCo approved these changes Jun 10, 2025
