Description
We are seeing OOM kills in our service when performing large batch operations if the Aerospike Go client is configured with:
LimitConnectionsToQueueSize bool
When this flag is set to false, the client appears to allocate an excessive number of connections to cluster nodes; memory usage spikes until it hits the container memory limit (~7GB) and the process is OOM killed. We also observe high latencies and errors during batch operations under this configuration.
When the flag is set to true, memory usage remains stable and no OOMs occur.
Steps to Reproduce:
- Configure the Aerospike Go client with LimitConnectionsToQueueSize = false.
- Perform repeated batch writes with a high number of keys.
- Observe memory usage grow rapidly until OOM.
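For reference, a minimal sketch of the client configuration we are using. The host address and ConnectionQueueSize value are placeholders for illustration, not our production values:

```go
package main

import (
	"log"

	as "github.com/aerospike/aerospike-client-go/v8"
)

func main() {
	policy := as.NewClientPolicy()

	// With this set to false (our original configuration), the client may
	// open connections beyond ConnectionQueueSize under load, which is
	// where we observe the unbounded memory growth.
	policy.LimitConnectionsToQueueSize = false
	policy.ConnectionQueueSize = 256 // placeholder pool size

	client, err := as.NewClientWithPolicy(policy, "127.0.0.1", 3000)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
}
```

Flipping LimitConnectionsToQueueSize to true in the same setup is the only change needed to make memory usage stable in our tests.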
Behavior Observed:
- Memory usage grows continuously during operations.
- Process eventually gets OOM killed (~7GB memory footprint in our environment).
Environment:
- Aerospike server: v8, 3-node cluster, RF=2
- Nodes: n2d-highmem-32 (GCP)
- Aerospike Go client: github.com/aerospike/aerospike-client-go/v8 v8.1.0
Are we misusing this flag, or is there an issue in the Go client when LimitConnectionsToQueueSize is set to false?
Heap profile details are attached.
