rumqttd broker will start refusing connections after a period of time #978

@jamiedumont

Morning all,

I'm having a hard time debugging an issue with a broker that I set up recently.

I have set up both a production and a staging MQTT broker on the same server, using similar configurations. Both TCP and WS listeners are available with TLS and, on separate ports, without it. Neither broker is under any real load at the moment, as we're still in development (2-3 devices pushing data every 15 seconds, with half a dozen connections in total).

After a few days, the production broker locks up and refuses connections, but only on the TLS-enabled port. I'm using Let's Encrypt certificates rather than self-signed ones, which seems to be the only departure from the documented configuration.

Once it has locked up, the journalctl logs show that the broker is still running correctly for some clients, whilst others are disconnected and new ones are refused. Connecting to the non-TLS port still works during this time.

I'm having trouble debugging this because no failures or errors are logged, other than clients timing out.

Verifying the certificates with echo -n Q | openssl s_client -servername mqtt.tracksyte.com -connect mqtt.tracksyte.com:8883 | openssl x509 -noout -dates works when the broker is healthy, but the command hangs once the broker has locked up, which suggests to me that the issue isn't the certificates themselves.
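To narrow down where the connection stalls, it may help to separate the TCP connect from the TLS handshake and put a timeout on each, so the check never hangs the way the openssl pipeline does. The following is only a rough sketch using Python's standard library; the function name probe_tls_port and its result strings are made up for illustration and are not part of rumqttd or openssl.

```python
import socket
import ssl


def probe_tls_port(host, port, timeout=5.0):
    """Report which stage of a TLS connection succeeds or stalls.

    Returns one of: "ok", "tcp_refused", "tcp_timeout", "tcp_error",
    "tls_timeout", "tls_error". (Hypothetical labels for this sketch.)
    """
    try:
        # Stage 1: plain TCP connect. This only needs the kernel to accept
        # the connection; the broker does not have to respond yet.
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "tcp_refused"
    except (socket.timeout, TimeoutError):
        return "tcp_timeout"
    except OSError:
        return "tcp_error"

    context = ssl.create_default_context()
    try:
        # Stage 2: TLS handshake on the connected socket. The socket keeps
        # the timeout set above, so a silent server cannot block forever.
        with context.wrap_socket(sock, server_hostname=host):
            return "ok"
    except (socket.timeout, TimeoutError):
        return "tls_timeout"
    except (ssl.SSLError, OSError):
        # Includes certificate verification failures.
        return "tls_error"
    finally:
        sock.close()
```

Calling probe_tls_port("mqtt.tracksyte.com", 8883) while the broker is locked up should show whether the port still accepts TCP connections and only the handshake stalls ("tls_timeout"), or whether the connection is rejected outright ("tcp_refused"). A certificate problem would more likely surface as "tls_error" than as a timeout.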

Here is my config (with auth redacted):

id = 0

[router]
max_connections = 10010
max_outgoing_packet_count = 200
max_segment_size = 104857600
max_segment_count = 10

[v4.1]
name = "v4-1"
listen = "0.0.0.0:1883"
next_connection_delay_ms = 1
        [v4.1.connections]
        connection_timeout_ms = 60000
        max_payload_size = 20480
        max_inflight_count = 100
        auth = { example_user = "kasjhdaoijsdaknsbdamn" }

# TLS configuration for secure connections
[v4.2]
name = "v4-2-tls"
listen = "0.0.0.0:8883"
next_connection_delay_ms = 10
        [v4.2.tls]
        capath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
        certpath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
        keypath = "/etc/letsencrypt/live/mqtt.tracksyte.com/privkey.pem"
        [v4.2.connections]
        connection_timeout_ms = 60000
        max_payload_size = 20480
        max_inflight_count = 100
        auth = { example_user = "kasjhdaoijsdaknsbdamn" }

[ws.1]
name = "ws-1"
listen = "0.0.0.0:8083"
next_connection_delay_ms = 1
        [ws.1.connections]
        connection_timeout_ms = 60000
        max_client_id_len = 256
        throttle_delay_ms = 0
        max_payload_size = 20480
        max_inflight_count = 500
        max_inflight_size = 1024
        auth = { example_user = "kasjhdaoijsdaknsbdamn" }

[ws.2]
name = "ws-2"
listen = "0.0.0.0:8084"
next_connection_delay_ms = 1
        [ws.2.tls]
        capath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
        certpath = "/etc/letsencrypt/live/mqtt.tracksyte.com/fullchain.pem"
        keypath = "/etc/letsencrypt/live/mqtt.tracksyte.com/privkey.pem"
        [ws.2.connections]
        connection_timeout_ms = 60000
        max_client_id_len = 256
        throttle_delay_ms = 0
        max_payload_size = 20480
        max_inflight_count = 500
        max_inflight_size = 1024
        auth = { example_user = "kasjhdaoijsdaknsbdamn" }

[console]
listen = "0.0.0.0:3030"

Any suggestions? My next steps are probably to embed rumqttd as a dependency rather than running it directly, to see if that makes a difference, or to use an alternative broker like nanomq to isolate whether the issue lies with rumqttd or with the Let's Encrypt certificates.
