Document the ulimit changes needed for running fleet-server in larger environments! #1568
Labels
Team:Docs
Team:Elastic-Agent-Control-Plane
Describe the enhancement:
Please document the required changes to the ulimit settings, especially on Ubuntu, for running fleet server in larger environments. If people do not make these changes, their fleet servers can stop responding because they have too many open files.
Describe a specific use case for the enhancement or feature:
With nearly 4,000 agents and two fleet servers behind a load balancer, I have run into the open file limit in Ubuntu with the fleet server. Currently there are no documented ulimit changes for fleet server, but there are for Elasticsearch. However, the Elasticsearch changes do not apply, because Elasticsearch runs under a different user (named elasticsearch) and fleet-server runs under the user root.
To see the current limit, first find the PID of the fleet-server process:
pidof fleet-server
Then run the following to see the SOFT limit on open files for that process:
prlimit -n --pid=[pid]
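If prlimit is not available, the same values can be read straight from /proc. The following is only a sketch: the helper name nofile_limits is made up for illustration, it assumes a Linux host, and it defaults to the current shell when no PID is passed.

```shell
# Print the soft and hard open-file limits of a running process.
# Reads /proc/<pid>/limits, which is Linux-specific.
# nofile_limits is a hypothetical helper name, not part of fleet-server.
nofile_limits() {
    pid="${1:-$$}"   # default to the current shell when no PID is given
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/${pid}/limits"
}

# Example for fleet-server (pidof can print several PIDs; -s asks for one):
#   nofile_limits "$(pidof -s fleet-server)"
```

Reading another user's process may require root, which is also the user fleet-server runs as here.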
What you will notice is the soft limitation will still remain at 1024 even if you modify the /etc/systemd/system.conf file.
In order to fix this you must edit /etc/systemd/system.conf and change the following line to something higher:
DefaultLimitNOFILE=262144:524288
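A minimal sketch of applying that edit from a script follows. The function name set_default_nofile is hypothetical, sed -i assumes GNU sed (as on Ubuntu), and the follow-up commands in the comments are assumptions that are not part of this report: systemd normally has to be re-executed (or the host rebooted) before it picks up system.conf changes, and the unit name elastic-agent is a guess at how fleet-server is started.

```shell
# Set DefaultLimitNOFILE in a systemd manager config file.
# Usage: set_default_nofile FILE SOFT:HARD
# set_default_nofile is a hypothetical helper name used for illustration.
set_default_nofile() {
    conf="$1"
    value="$2"
    if grep -Eq '^[#[:space:]]*DefaultLimitNOFILE=' "$conf"; then
        # Replace the existing (possibly commented-out) line in place.
        sed -i -E "s|^[#[:space:]]*DefaultLimitNOFILE=.*|DefaultLimitNOFILE=${value}|" "$conf"
    else
        # No such line yet: append one (assumes [Manager] is the last section).
        printf 'DefaultLimitNOFILE=%s\n' "$value" >> "$conf"
    fi
}

# Example, run as root (back up the file first):
#   set_default_nofile /etc/systemd/system.conf 262144:524288
#   systemctl daemon-reexec            # assumption: needed to re-read system.conf
#   systemctl restart elastic-agent    # assumption: unit that runs fleet-server
```

After the restart, running prlimit against the new fleet-server PID should show whether the higher soft limit took effect.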
Editing the /etc/pam.d/login file to include pam_limits.so does not fix the SOFT limit problem. This should be properly documented so that time is not wasted. Ticket 00975556