Morning all,
I'm having a hard time debugging an issue with a broker that I set up recently.
I have set up both a production and a staging MQTT broker on the same server using similar configurations. Both the TCP and WS listeners are configured with TLS, with unsecured connections available on separate ports. Neither broker is under any real load at the moment as we're still in development (2-3 devices pushing data every 15 seconds, with half a dozen connections in total).
After a few days the production broker locks up and refuses connections, but only on the TLS-enabled port. I'm using Let's Encrypt certificates rather than self-signed ones, which seems to be the only departure from the documented configuration.
Once locked up, the logs from journalctl show that it is still running correctly for some clients, whilst others are disconnected and new ones are refused. Connecting to the non-TLS port works during this time.
This is hard to debug because nothing is logged as a failure or error, other than clients timing out.
Verifying the certificates with
echo -n Q | openssl s_client -servername mqtt.tracksyte.com -connect mqtt.tracksyte.com:8883 | openssl x509 -noout -dates
works when the broker is healthy, but hangs once the broker has locked up, which suggests to me that the issue isn't the certificates themselves.
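The same check can be scripted so that it never hangs and reports which stage failed. The following is a minimal sketch, not part of rumqttd or its tooling: the function name probe_tls and the four result labels are made up for illustration, and it assumes a certificate that validates against the system trust store (as a Let's Encrypt one normally would).

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Try a TCP connect followed by a TLS handshake, each bounded by `timeout`.

    Returns one of the (hypothetical) labels:
    'ok', 'tcp_failed', 'handshake_timeout', 'handshake_failed'.
    """
    try:
        # Stage 1: plain TCP connect. Refused or timed-out connects end here.
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"

    context = ssl.create_default_context()
    try:
        with raw:
            raw.settimeout(timeout)
            # Stage 2: the TLS handshake runs inside wrap_socket().
            with context.wrap_socket(raw, server_hostname=host):
                return "ok"
    except socket.timeout:
        # TCP was accepted, but the peer never completed the handshake.
        return "handshake_timeout"
    except (ssl.SSLError, OSError):
        # Handshake was rejected or the certificate did not verify.
        return "handshake_failed"
```

Calling probe_tls("mqtt.tracksyte.com", 8883) during a lock-up and getting 'handshake_timeout' rather than 'tcp_failed' would indicate that the listener still accepts TCP connections but stalls before or during the TLS handshake, which narrows down where to look.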
Here is my config (with auth redacted):
id = 0
[router]
max_connections = 10010
max_outgoing_packet_count = 200
max_segment_size = 104857600
max_segment_count = 10
[v4.1]
name = "v4-1"
listen = "0.0.0.0:1883"
next_connection_delay_ms = 1
[v4.1.connections]
connection_timeout_ms = 60000
max_payload_size = 20480
max_inflight_count = 100
auth = { example_user = "kasjhdaoijsdaknsbdamn" }
# TLS configuration for secure connections
[v4.2]
name = "v4-2-tls"
listen = "0.0.0.0:8883"
next_connection_delay_ms = 10
[v4.2.tls]
capath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
certpath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
keypath = "/etc/letsencrypt/live/mqtt.tracksyte.com/privkey.pem"
[v4.2.connections]
connection_timeout_ms = 60000
max_payload_size = 20480
max_inflight_count = 100
auth = { example_user = "kasjhdaoijsdaknsbdamn" }
[ws.1]
name = "ws-1"
listen = "0.0.0.0:8083"
next_connection_delay_ms = 1
[ws.1.connections]
connection_timeout_ms = 60000
max_client_id_len = 256
throttle_delay_ms = 0
max_payload_size = 20480
max_inflight_count = 500
max_inflight_size = 1024
auth = { example_user = "kasjhdaoijsdaknsbdamn" }
[ws.2]
name = "ws-2"
listen = "0.0.0.0:8084"
next_connection_delay_ms = 1
[ws.2.tls]
capath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
certpath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
keypath = "/etc/letsencrypt/live/mqtt.tracksyte.com/privkey.pem"
[ws.2.connections]
connection_timeout_ms = 60000
max_client_id_len = 256
throttle_delay_ms = 0
max_payload_size = 20480
max_inflight_count = 500
max_inflight_size = 1024
auth = { example_user = "kasjhdaoijsdaknsbdamn" }
[console]
listen = "0.0.0.0:3030"
Any suggestions? My next steps are probably to use rumqttd as a dependency rather than running it directly, to see if that makes a difference, or to try an alternative such as nanomq to isolate whether the issue is with rumqttd or the LE certificates.