Handle Offline Operators to Maintain Domain Throughput #3645

@jfrank-summit

Problem

Operators that stay staked but go offline remain eligible for bundle production elections according to their stake weight. When selected, an offline operator produces no bundle, so the domain's overall throughput drops in proportion to the offline share of stake. The impact is largest for operators with large stakes, because the system keeps allocating slots to them and gets no output.
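
As a rough illustration of the proportionality described above, the following sketch computes the expected bundle rate when part of the stake is offline. It assumes election probability is exactly proportional to stake; the function name and the numbers are hypothetical and do not come from the codebase.

```python
def expected_throughput(nominal_bundles_per_block: float,
                        offline_stake_share: float) -> float:
    """Expected bundle rate when offline operators keep winning slots.

    Assumption (hypothetical model): an operator wins election slots in
    proportion to its stake share, and an offline operator produces nothing.
    """
    if not 0.0 <= offline_stake_share <= 1.0:
        raise ValueError("offline_stake_share must be within [0, 1]")
    return nominal_bundles_per_block * (1.0 - offline_stake_share)


# With a nominal rate of 6 bundles per block and 25% of stake offline,
# the expected rate falls to 4.5 bundles per block.
rate = expected_throughput(6.0, 0.25)
```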

Key issues:

  • Decreased bundle production rate
  • Wasted election slots
  • Potential for liveness attacks if adversaries stake without running nodes
  • No automatic mechanism to detect and handle prolonged inactivity

This was partially mitigated in PR #2215, but a more complete solution is needed.

Proposed Solution

Implement a mechanism to automatically detect offline operators and temporarily exclude their stake from elections, while providing a way for them to signal their return to active status. We are researching various detection heuristics and exclusion algorithms to balance reliability, security, and operator experience.

Key Features

  • Automatic detection of inactivity (e.g., a probabilistic check of actual bundle production against the expected count over epochs)
  • Temporary exclusion from elections without forced unstaking
  • Simple reactivation process via extrinsic
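
One possible shape for these features is sketched below. It is not the chosen design, since the detection heuristic is still being researched. The sketch assumes each election slot is an independent trial with a known per-slot win probability, so bundle counts in an epoch follow a binomial distribution. All names (`Operator`, `check_epoch`, `reactivate`, `election_stake`) and the threshold value are hypothetical.

```python
import math
from dataclasses import dataclass


def prob_at_most(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


@dataclass
class Operator:
    stake: int
    active: bool = True  # False: stake is excluded from elections, not unstaked


def check_epoch(op: Operator, produced: int, slots_in_epoch: int,
                win_prob: float, false_positive_bound: float = 1e-6) -> bool:
    """Exclude `op` if an online operator would almost never produce so few bundles.

    `win_prob` is the assumed per-slot election probability for this operator.
    Returns the operator's active flag after the check.
    """
    if op.active and prob_at_most(produced, slots_in_epoch, win_prob) < false_positive_bound:
        op.active = False
    return op.active


def reactivate(op: Operator) -> None:
    """Stand-in for the reactivation extrinsic: the operator signals its return."""
    op.active = True


def election_stake(operators: list[Operator]) -> int:
    """Total stake that participates in elections (active operators only)."""
    return sum(op.stake for op in operators if op.active)
```

Under this model, a large operator that produces nothing over many slots is excluded quickly, while a small operator with zero bundles stays active because zero is a likely outcome for it. That is one way to keep false positives low, at the cost of slower detection for small stakes.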

Outcomes

  • Mechanism prevents throughput degradation from offline operators
  • False positives are minimized (active operators not mistakenly excluded)
  • Mechanism is resistant to gaming

Note: Replaces #2629
