Skip to content

Conversation

@WilliamVoong
Copy link
Collaborator

No description provided.

William Voong added 2 commits November 6, 2025 18:38
The send loop process relies on incoming message in its message queue to
batch properly. However, there is a race condition, the send loop
process might send its data before handling the incoming message
arriving in its message queue, causing unnecessary overhead on the
client side and making batching mechanism unreliable in ered.

This commit handles the removal of the send loop and moving gen_tcp:send
to ered_client.erl.

Change-Id: Ic3aedb38e3caa86d3fa6862a4c976963393cec2f
Implementation of batching in ered_client.erl.

The strategy for batching follows this pattern:

Before hitting ered_client.erl max_pending,
we will not batch any commands.

After hitting max_pending no new commands will
be sent to Valkey/Redis server, and will queue up
commands in its waiting queue, and once we
are below max pending again, commands will be batched
and be sent to redis.

Change-Id: I2fe7e29be829106d9355b105cc190cac0ba0850c
Copy link
Collaborator

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks William. This solves the actually problem, which is the main purpose. I'm posting my comments. Let me think about merging this and doing more stuff later or how to continue. I understand you're busy with other stuff.

This change doesn't actually remove any process. The old send process still exists but it is just doing the connect, starts the recv process and then waits for it to exit. I think we can completely get rid of this process and let ered_connection be only the recv process. The recv process can do the connect. (Also we can turn it into a gen_server, but it might be a larger change that I have some WIP code for.)

command(ClusterRef, Command, Key, Timeout) when is_binary(Key) ->
C = ered_command:convert_to(Command),
gen_server:call(ClusterRef, {command, C, Key}, Timeout).
gen_server:call(ClusterRef, {command, Command, Key}, Timeout).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we talked about: Let's revert these changes. It's good to call convert_to early, before the command is sent between processes. When it's being called again later, it's is a no-op.

Comment on lines +68 to +69
transport_socket = none :: gen_tcp:socket() | ssl:sslsocket(),
recv_pid = none :: pid(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transport_socket is a tuple so it should have a tuple type like {module(), any()} or more exact {gen_tcp, gen_tcp:socket()} | {ssl, ssl:sslsocket()}.

Additionally, the type needs to allow none because it is set to none by default.

Same with recv_pid. The type needs to be none | pid().

I think lines these should be located just after to the other pids in the state, i.e. just after connect_loop_pid and connection_pid. (What is connect_loop_pid? Is it unused?)

{noreply, connection_down(Reason, State)};

handle_info({connected, Pid, ClusterId}, State) ->
handle_info({connected, ConnectionPid, Socket, RecvPid, ClusterId}, State) ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the connection pid for? I don't think ered_client is using it for anything. We could remove it from this message.

What we need from ered_connection is Transport, Socket and the RecvPid. We could put these three in a tuple and treat it as an opaque connection handle. Technically, ered_client doesn't even need to know what it contains. It could just use it when calling functions in the ered_connection module.

self() ! {socket_closed, Reason}
end.

batch_send(RecvPid, {Socket, Transport}, Commands, Ref) ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function has duplicated code with send. It's possible to refactor it to avoid the duplicated code.

Additionally, this code that builds RefInfo = {Class, self(), Ref, []} is designed just for the receiving process. Think about this as designing an API between the modules. This RefInfo stuff can be internal details to the ered_connection module. I think the send and batch_send functions should be moved to the ered_connection module, but still run in the caller's process.

Comment on lines +570 to +574
%% This will shut down recv_loop if it is waiting for a reference
RecvPid ! close_down,
%% Ok, recv done, time to die
receive {recv_exit, _Reason} -> ok end,
self() ! {socket_closed, Reason}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Time to die" is about the ered_connection process. It's not true for the ered_client where this code is running. We should edit the comment or remove it (or move it to ered_connection).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants