Monitoring, statistics and alerting system #937

@nour-massri

Description

During Battlecode 2025 we experienced a variety of issues:

  • matches that ran for a very long time -> immediate solution: we had to cancel them
  • the queue grew very large while Saturn wasn't executing any matches -> immediate solution: purge the messages in Pub/Sub and re-queue them
  • ratings for ranked matches were not calculated -> immediate solution: recalculate all ratings from oldest to newest

These problems shouldn't have happened in the first place, and we are working on making our system more robust to this kind of problem. However, they highlight the importance of having a monitoring system that checks that our system is functioning and reports any disruptions by emailing the devs.
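As a starting point for discussion, here is a minimal sketch of what such checks could look like, one per failure mode listed above. Everything in it is an assumption made for illustration: `MatchSnapshot`, `find_alerts`, `build_email`, the status strings, and the thresholds are hypothetical and do not correspond to the real Siarnaq models or Saturn code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from email.message import EmailMessage
from typing import List, Optional


@dataclass
class MatchSnapshot:
    """Hypothetical view of one match; not the real Siarnaq model."""
    match_id: int
    status: str                            # assumed: "queued", "running", "finished"
    started_at: Optional[datetime] = None  # None until the match starts running
    is_ranked: bool = False
    rating_applied: bool = False


def find_alerts(
    matches: List[MatchSnapshot],
    now: datetime,
    last_completion: Optional[datetime],
    max_runtime: timedelta = timedelta(minutes=30),  # assumed threshold
    max_queued: int = 100,                           # assumed threshold
    stall_after: timedelta = timedelta(minutes=15),  # assumed threshold
) -> List[str]:
    """Return one human-readable alert per detected problem."""
    alerts = []

    # 1. Matches that have been running longer than the allowed runtime.
    for m in matches:
        if m.status == "running" and m.started_at and now - m.started_at > max_runtime:
            alerts.append(f"match {m.match_id} has been running for {now - m.started_at}")

    # 2. A large queue combined with no recent completion (executor looks stalled).
    queued = sum(1 for m in matches if m.status == "queued")
    stalled = last_completion is None or now - last_completion > stall_after
    if queued > max_queued and stalled:
        alerts.append(f"{queued} matches queued and none completed recently")

    # 3. Ranked matches that finished without a rating update.
    for m in matches:
        if m.status == "finished" and m.is_ranked and not m.rating_applied:
            alerts.append(f"ranked match {m.match_id} finished without a rating update")

    return alerts


def build_email(alerts: List[str], sender: str, recipients: List[str]) -> Optional[EmailMessage]:
    """Build (but do not send) a summary email; returns None if there is nothing to report."""
    if not alerts:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"[monitoring] {len(alerts)} alert(s)"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("\n".join(alerts))
    return msg
```

A periodic job could call `find_alerts` on a snapshot of recent matches and pass the resulting message to `smtplib.SMTP(...).send_message(msg)`. Keeping the checks as pure functions over a snapshot makes them easy to unit test without a database or a mail server. A real check for missing ratings would probably also need a grace period after a match finishes, which this sketch leaves out.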

Labels

  • module: backend (Related to the Siarnaq backend module)
  • module: devops (Related to deployments and other operations)
  • module: saturn (Related to the Saturn module)
  • priority: p3 low
  • type: feature (New feature or request, or quick non-essential bugfix)
