OpAMP Agent heartbeats

Currently, if an agent is functioning normally and neither its health nor its component status changes, no messages are sent after the initial handshake. For WebSockets, because no messages are sent from the agent to the server, most of these connections will go idle after a certain period and be shut down, causing a broken socket. I propose that we add support for a heartbeat interval as part of the specification. I have already implemented this capability [in the opamp-bridge](https://github.com/open-telemetry/opentelemetry-operator/blob/main/cmd/operator-opamp-bridge/agent/agent.go#L247-L267), where a user can configure a heartbeat interval to periodically set the agent's health message. HTTP theoretically gets around this with its polling interval.

This heartbeat implementation is important for the bridge, where many of the events in Kubernetes happen asynchronously. The bridge is not informed of them directly and therefore must poll state to send this message. The collector's opamp extension, however, watches for some changes but is currently not busy. For the extension, the collector will idle after some time, which prevents the server from being able to send any more messages to the extension.

I think a heartbeat interval could be optional; however, it must be communicated as part of the initial AgentToServer message. This would allow the server to know when to mark the agent as unhealthy.

### Open Questions
* Should this functionality only exist for the socket transport?
  * I believe it would be valuable for this to exist regardless of the transport, and that the HTTP poll interval could be deprecated in favor of this. This would allow the server to make decisions about the liveness of the agent regardless of the transport.
* What should the default interval be? 30s?
* Should the heartbeat interval be negotiable from the server?

I'm happy to write the spec change for this issue, but would love everyone's thoughts here.

OpAMP Agent heartbeats #183
