TLS connection error oversize record received #164

Open
@beyhan

We are from the BOSH team and use the nats.rb client with TLS enabled. We are running into a NATS server error: oversized record received.
We observed that the client sends plaintext messages to the server before TLS has started, which results in the oversized record error.
We only see the issue when the BOSH Director produces a heavy load of NATS messages. We use nats.rb version 0.9.2, but we also tried 0.11.0 and could reproduce the issue there as well.
Here is an example of what we see with tshark. The Director is the client and uses nats.rb version 0.9.2. nats is the server, version 1.3.0, which is forked and can be found here.

When the connection fails with a TLS handshake error:

director -> nats (SYN)
director <- nats (SYN, ACK)
director -> nats (ACK)
director -> nats (PSH) 
payload: CONNECT {"verbose":false,"pedantic":false,"lang":"ruby","version":"0.9.2","protocol":1,"ssl_required":true,"tls_required":true}
director <- nats (ACK)
director <- nats (PSH, ACK)
payload:
INFO {"server_id":"xA5UfriyGj2oHzAaQH42Ji","version":"1.3.0-bosh.2 2018-10-25T15:57:27Z a79a47d","proto":1,"go":"go1.11","host":"0.0.0.0","port":4222,"auth_required":true,"tls_required":true,"tls_verify":true,"max_payload":1048576,"client_id":4172}
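One plausible explanation for the specific error text, not confirmed in this thread: if the server is already expecting a TLS record, it reads the first five bytes of the plaintext CONNECT line as a TLS record header (1 byte type, 2 bytes version, 2 bytes length). The sketch below does that arithmetic. `parse_as_tls_header` is a hypothetical helper written for illustration, and the limit of 2^14 + 2048 bytes is the TLS 1.2 ciphertext bound from RFC 5246; whether the forked server applies exactly this check is an assumption.

```ruby
# Upper bound on TLSCiphertext.length in TLS 1.2 (RFC 5246, section 6.2.3).
TLS_MAX_CIPHERTEXT = 2**14 + 2048

# Hypothetical helper: interpret the first five bytes of a buffer as a
# TLS record header (content type, protocol version, payload length).
def parse_as_tls_header(bytes)
  # C = unsigned 8-bit, n = unsigned 16-bit big-endian
  type, version, length = bytes.byteslice(0, 5).unpack("Cnn")
  { type: type, version: version, length: length,
    oversized: length > TLS_MAX_CIPHERTEXT }
end

# The plaintext CONNECT line, as seen in the failing trace.
header = parse_as_tls_header('CONNECT {"verbose":false}')
puts format("type=0x%02x version=0x%04x length=%d oversized=%s",
            header[:type], header[:version], header[:length], header[:oversized])
# prints: type=0x43 version=0x4f4e length=20037 oversized=true
```

The bytes "NE" from "CONNE" decode to a length of 20037, which is above the 18432-byte bound, so a server parsing this as TLS would see an oversized record rather than a handshake.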

Normally, when the connection is successful:

director -> nats (SYN)
director <- nats (SYN, ACK)
director -> nats (ACK)
director <- nats (PSH) 
payload:
INFO {"server_id":"xA5UfriyGj2oHzAaQH42Ji","version":"1.3.0-bosh.2 2018-10-25T15:57:27Z a79a47d","proto":1,"go":"go1.11","host":"0.0.0.0","port":4222,"auth_required":true,"tls_required":true,"tls_verify":true,"max_payload":1048576,"client_id":4173} 
director -> nats (ACK)
director -> nats (PSH, ACK)
payload: [binary TLS handshake data, not printable]

The difference is that, in the error case, the NATS CONNECT message is sent before the INFO message is received and before TLS has started. The main issue for us is that we lose messages, because they never reach the NATS server in this scenario. Is there a guarantee that the NATS client will deliver messages to the NATS server, or can some be lost when reconnect errors occur (e.g. in EM/SSL buffers)?
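To make the expected ordering concrete, here is a minimal sketch of a gate that holds outbound commands until the server's INFO has arrived and, when `tls_required` is set, until the TLS handshake has finished. `HandshakeGate` and its methods are hypothetical names for illustration only; this is not how nats.rb is implemented, and the `sent` array merely stands in for the real socket write.

```ruby
require "json"

# Hypothetical helper (not part of nats.rb): queues outbound protocol
# lines until INFO has been seen and, if required, TLS is established.
class HandshakeGate
  attr_reader :sent

  def initialize
    @info = nil         # parsed INFO payload, nil until received
    @tls_ready = false  # true once the TLS handshake has completed
    @pending = []       # commands queued while the connection is not ready
    @sent = []          # stand-in for data written to the socket
  end

  # Feed a line received from the server; only INFO is handled here.
  def receive_line(line)
    return unless line.start_with?("INFO ")
    @info = JSON.parse(line.sub(/\AINFO\s+/, ""))
    flush
  end

  # Call this when the TLS handshake has finished.
  def tls_established!
    @tls_ready = true
    flush
  end

  def ready?
    return false if @info.nil?
    !tls_required? || @tls_ready
  end

  # Queue a command; it is released only once the connection is ready.
  def write(command)
    @pending << command
    flush
  end

  private

  def tls_required?
    @info["tls_required"] == true
  end

  def flush
    return unless ready?
    @sent.concat(@pending)
    @pending.clear
  end
end

gate = HandshakeGate.new
gate.write("CONNECT {\"tls_required\":true}\r\n") # queued, not sent
gate.receive_line('INFO {"tls_required":true,"max_payload":1048576}')
puts gate.sent.length # still 0: TLS has not been established yet
gate.tls_established!
puts gate.sent.length # 1: CONNECT is released only after INFO and TLS
```

With this ordering, a CONNECT issued under load stays queued rather than reaching the wire in plaintext. The sketch does not address whether queued or in-flight messages survive a reconnect, which is the open question above.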
