Long Running AWS Connections Failing #336

@gsexton

I'm running into an issue with long-lived connections failing on AWS ECS containers and was hoping for some help.

I'm using:

github.com/go-openapi/runtime v0.28.0
go: 1.23.6

The application has swagger-generated clients that talk to other services. I frequently see "context deadline exceeded" messages in my error logs. I opened a ticket with AWS, and the VPC support engineer (SE) said this was a keepalive/timeout issue: the NAT gateway has an idle connection timeout of 350 seconds.

https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-troubleshooting.html#nat-gateway-troubleshooting-timeout

According to the SE, when the connection is closed by the NAT gateway, the client doesn't receive a notification, and it hangs until the context deadline expires. The solution is to either use keepalive packets, or to close the connections and re-open them before the idle connection timeout expires.

Does this diagnosis sound correct?

I looked through the docs and found Runtime.EnableConnectionReuse(), but from reading the docs and looking at the code, it doesn't look like it addresses my problem. I tried it anyway, and it made no difference.

I have not tried re-generating the clients using swagger. Would that help?

Any ideas would be really appreciated.
