Migrate internal monitoring to free 3rd party service #208

@arm4b

Our internal infrastructure includes an st2monitoring server that hosts the dashboard and runs client checks (services, memory, processes, ports) on each internal infra node, including the st2cicd server, as well as external checks (APIs, SSL cert expiry, domains, and availability health checks for the ST2 websites).

To reduce the amount of infra, costs, and moving pieces, and to rely less on AWS resources (see https://github.com/orgs/StackStorm/projects/27), we should remove the st2monitoring server and start migrating to a free 3rd party service for monitoring and alerting.

For example, we could use Scalyr (where @Kami works).

There are several sub-tasks here:

  • research the monitoring/alerting platform (e.g. whether Scalyr is a good fit)
  • create and configure an st2 TSC account on the 3rd party monitoring service
    • shared account/email with the TSC
    • monitoring alerts should go to #opstown Slack
  • set up external API/web checks:
    • APIs
    • SSL + expiry
    • Domains + expiry
    • Health checks:
      • stackstorm.com
      • stackstorm.org
      • index.stackstorm.org
      • helm.stackstorm.com
      • api.stackstorm.com
      • docs.stackstorm.com
      • st2cicd webhook endpoints
  • create internal checks for st2cicd:
    • via 3rd party monitoring agent/client
    • migrate st2cicd internal checks: memory, CPU, services, processes, etc.

Example with external checks:
[screenshot: monitoring]
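
To make the scope of the external checks concrete, here is a minimal sketch of what a hosted service would be doing for us: an HTTPS availability check plus an SSL cert expiry check for the hosts listed above. It is only an illustration, not a proposal to self-host this. The 14-day warning threshold, the function names, and checking the root path `/` of every host are assumptions; some endpoints (e.g. api.stackstorm.com) may need a specific path.

```python
import socket
import ssl
import urllib.error
import urllib.request
from datetime import datetime, timezone

# Hosts taken from the health check list in this issue.
HOSTS = [
    "stackstorm.com",
    "stackstorm.org",
    "index.stackstorm.org",
    "helm.stackstorm.com",
    "api.stackstorm.com",
    "docs.stackstorm.com",
]

# Assumption: warn when a certificate has fewer than this many days left.
CERT_WARN_DAYS = 14


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days (floored) between `now` and a cert's notAfter string.

    `not_after` uses the format returned by getpeercert(),
    e.g. "Jan 15 00:00:00 2030 GMT". `now` must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def fetch_cert_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the certificate's notAfter field."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """True if the URL answers with a 2xx/3xx status (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False


def check_host(host: str) -> list[str]:
    """Return a list of human-readable problems for one host (empty if healthy)."""
    problems = []
    if not http_ok(f"https://{host}/"):
        problems.append(f"{host}: HTTPS health check failed")
    try:
        days_left = days_until_expiry(
            fetch_cert_not_after(host), datetime.now(timezone.utc)
        )
        if days_left < CERT_WARN_DAYS:
            problems.append(f"{host}: SSL cert expires in {days_left} days")
    except (OSError, ssl.SSLError) as exc:
        problems.append(f"{host}: could not read SSL cert ({exc})")
    return problems
```

Calling `check_host(h)` for each entry in `HOSTS` yields the messages that would be routed to #opstown. Domain expiry is not covered here, since it needs a WHOIS/RDAP lookup that the standard library does not provide.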

Example for st2cicd server:
[screenshot: image]
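
For the internal st2cicd checks, a 3rd party agent would normally collect these metrics itself. The sketch below only illustrates the kind of data involved (memory, CPU load, services, ports) so we can compare it against what a given agent supports. It assumes a Linux host with systemd; the service name and port in the usage note are hypothetical examples, not the actual st2cicd configuration.

```python
import os
import socket
import subprocess


def memory_used_percent(meminfo_text: str) -> float:
    """Percent of memory in use, computed from /proc/meminfo contents.

    Uses MemTotal and MemAvailable (both reported in kB by the Linux kernel).
    """
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    total = values["MemTotal"]
    available = values["MemAvailable"]
    return 100.0 * (total - available) / total


def read_memory_used_percent() -> float:
    """Read /proc/meminfo on the local (Linux) host."""
    with open("/proc/meminfo", encoding="ascii") as fh:
        return memory_used_percent(fh.read())


def load_average_1m() -> float:
    """1-minute load average, a rough stand-in for CPU pressure."""
    return os.getloadavg()[0]


def service_active(name: str) -> bool:
    """True if systemd reports the unit as active (assumes systemd is present)."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", name], check=False
    )
    return result.returncode == 0


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0
```

On the st2cicd server this could be used as, for example, `service_active("st2api")` or `port_open("127.0.0.1", 443)`; both values are placeholders for whatever the current st2monitoring client checks actually cover.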

Finishing just the first part, migrating the external checks, would already be great. At that point we can remove the st2monitoring server, which would save us $60/mo in AWS.
