Description
What happened?
When the remediation component fails to connect to LAPI while running with the nftables backend, the whole service shuts down and flushes the nftables sets:
time="10-05-2024 11:06:07" level=info msg="Processing new and deleted decisions . . ."
time="10-05-2024 11:07:07" level=error msg="http code 504, invalid body: invalid character '<' looking for beginning of value"
time="10-05-2024 11:07:07" level=info msg="Shutting down backend"
time="10-05-2024 11:07:07" level=info msg="flushing 'crowdsec-blacklists' set in 'crowdsec' table"
time="10-05-2024 11:07:07" level=info msg="flushing 'crowdsec6-blacklists' set in 'crowdsec6' table"
time="10-05-2024 11:07:07" level=fatal msg="process terminated with error: bouncer stream halted"
time="10-05-2024 11:07:17" level=info msg="Starting crowdsec-firewall-bouncer v0.0.28-debian-pragmatic-af6e7e25822c2b1a02168b99ebbf8458bc6728e5"
time="10-05-2024 11:07:17" level=info msg="backend type : nftables"
time="10-05-2024 11:07:17" level=info msg="nftables initiated"
This is not what we want, because the IPs currently in the sets are still useful to the service and should stay blocked.
What did you expect to happen?
The remediation component should tolerate failures to connect to LAPI once the service has started. If the first connection at startup fails, then yes, exit and restart; after a successful start it should be resilient, keeping the existing sets and retrying.
How can we reproduce it (as minimally and precisely as possible)?
Bring up a LAPI and the firewall remediation component, then make LAPI return an error response. A user has reported that the service goes down when the response code is greater than 500 (504 in the log above).
Anything else we need to know?
No response