Support a handler for checking connection status using Ping frame in HTTP/2 #3612

Open
wants to merge 21 commits into
base: 1.2.x

Conversation

raccoonback
Contributor

@raccoonback raccoonback commented Feb 2, 2025

Description

In some cases, HTTP/2 connections may remain open even when they are no longer functional. By introducing a periodic Ping frame health check, we can proactively detect and close unhealthy connections.

The Http2ConnectionLivenessHandler schedules and sends Ping frames at a user-defined interval to verify the connection's liveness. If an acknowledgment (ACK) is not received within the specified time, the connection is considered unhealthy and is closed automatically.

However, if other frames are actively being sent or received, the scheduler does not send a Ping frame. This is because the server may delay ACK responses for various reasons. To prevent unnecessary connection termination, Ping frames are only sent when no read or write activity is detected.

Additionally, a configurable retry threshold for Ping frame transmission has been introduced. If a Ping frame fails to receive an ACK response, it will be retried up to the specified threshold before considering the connection unhealthy. This allows fine-tuning of the failure detection mechanism, balancing between aggressive failure detection and avoiding premature disconnections.
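The idle-only rule described above can be illustrated with a small, framework-free sketch. It is not the PR's implementation: `IdlePingGate`, `onActivity`, and `shouldSendPing` are hypothetical names, and the sketch assumes that "no read or write activity" means no frame was seen for at least one full `pingScheduleInterval`.

```java
import java.time.Duration;

/** Hypothetical helper (not part of the PR): decides whether a liveness PING is due. */
final class IdlePingGate {
    private final long intervalNanos;
    private long lastActivityNanos;

    IdlePingGate(Duration pingInterval, long nowNanos) {
        this.intervalNanos = pingInterval.toNanos();
        this.lastActivityNanos = nowNanos;
    }

    /** Call on every frame read or written; any traffic already proves liveness. */
    void onActivity(long nowNanos) {
        lastActivityNanos = nowNanos;
    }

    /** True only when nothing was read or written for at least one full interval (assumed rule). */
    boolean shouldSendPing(long nowNanos) {
        // Subtraction keeps the comparison valid for System.nanoTime() style clocks.
        return nowNanos - lastActivityNanos >= intervalNanos;
    }

    public static void main(String[] args) {
        long t0 = 0L;
        IdlePingGate gate = new IdlePingGate(Duration.ofMillis(300), t0);
        System.out.println(gate.shouldSendPing(Duration.ofMillis(100).toNanos())); // false
        gate.onActivity(Duration.ofMillis(250).toNanos());
        System.out.println(gate.shouldSendPing(Duration.ofMillis(500).toNanos())); // false
        System.out.println(gate.shouldSendPing(Duration.ofMillis(550).toNanos())); // true
    }
}
```

Passing the clock value in as a parameter keeps the sketch deterministic; a real handler would read the time on the event loop instead.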

Scheduler Flow

http2_ping_flows_1

Ping ACK Flow

http2_ping_flow

Key Changes

  1. Added Http2ConnectionLivenessHandler to check connection health with Ping frames at a configurable interval.
  2. Introduced a retry threshold setting to limit the number of Ping transmission attempts before marking the connection as unhealthy.
  3. Made the allowable Ping ACK range configurable through Http2SettingsSpec:
HttpClient.create()
    .protocol(HttpProtocol.H2)
    .secure()
    .http2Settings(
        builder -> builder.pingAckTimeout(Duration.ofMillis(600))  // Max wait time for ACK
            .pingScheduleInterval(Duration.ofMillis(300))  // Interval for sending PING frames
            .pingAckDropThreshold(2)  // Maximum retries before considering the connection dead
    );

Related Issue

@raccoonback raccoonback marked this pull request as ready for review February 2, 2025 08:17
@raccoonback raccoonback force-pushed the issue-3301 branch 9 times, most recently from 171dfc5 to 202d036 Compare February 4, 2025 00:11
@raccoonback raccoonback marked this pull request as draft February 5, 2025 16:36
@raccoonback raccoonback marked this pull request as ready for review February 7, 2025 17:12
@violetagg violetagg added the type/enhancement A general enhancement label Feb 10, 2025
@violetagg violetagg added this to the 1.2.4 milestone Feb 10, 2025
@violetagg
Member

@raccoonback Can you rebase against branch 1.2.x and switch the PR's target branch to 1.2.x? We have branched: main is now the next development line, 1.3.x, while we want this feature in version 1.2.x.

@raccoonback raccoonback changed the base branch from main to 1.2.x February 11, 2025 23:47
@raccoonback
Contributor Author

@violetagg
Hello!
I have rebased the branch onto 1.2.x and changed the target branch of the PR to 1.2.x.

- Added a method to configure the execution interval of the scheduler
  that sends HTTP/2 PING frames and periodically checks for ACK responses
- Introduced a retry threshold setting to limit the number of PING transmission attempts
  before considering the connection as unresponsive
- Default values:
  - Scheduler interval must be explicitly set
  - Retry threshold defaults to 0 (no retries, only one PING attempt)

Signed-off-by: raccoonback <[email protected]>
@violetagg
Member

@raccoonback I'll review this PR later this week

@violetagg violetagg self-requested a review February 17, 2025 15:25
@violetagg
Member

violetagg commented Feb 20, 2025

@raccoonback Nice idea! I think that we can reuse our current reactor.netty.http.server.IdleTimeoutHandler (we need to check whether its current position in the pipeline is OK or whether we need to move it) and extend it with the HTTP/2 ACK functionality. At the moment it handles idle timeout for HTTP/2 and H2C on the server. The more ChannelHandlers you have in the pipeline, the worse the performance, so we always try to keep the number of ChannelHandlers to a minimum. Also, all methods in reactor.netty.http.server.IdleTimeoutHandler are invoked on the event loop, which guarantees that the functionality is thread-safe.

I think this feature is interesting for both server and client, similar to TCP keep-alive that we support for both of them.

@raccoonback
Contributor Author

raccoonback commented Feb 20, 2025

@violetagg
Hello, and thanks for your suggestion! 😃
I have a few questions for clarification.

  1. You suggested reusing reactor.netty.http.server.IdleTimeoutHandler.
    Currently, the implementation works by scheduling a PING frame transmission when there is no read/write activity, then checking whether an ACK is received within the allowed time. (It also supports retries up to a certain threshold.)
    I understand that you’re suggesting leveraging IdleTimeoutHandler to detect when both read and write operations are idle and incorporating it into this logic. Is that correct?
    Specifically, do you mean modifying the handler so that when both read and write operations are idle, a PING frame is sent, and if no ACK is received within the retry threshold, the connection is closed?
    If I’ve misunderstood, could you clarify what exactly you mean by "reuse"? 😮

  2. Can I understand that you are proposing to extend this feature not only on the client side, but also on the server side by IdleTimeoutHandler?

    • I thought that this Issue was intended to introduce PING frame support only for the client-side at first.

Looking forward to your feedback! 😊

@violetagg
Member

violetagg commented Feb 21, 2025

> @violetagg Hello, and thanks for your suggestion! 😃 I have a few questions for clarification.
>
>   1. You suggested reusing reactor.netty.http.server.IdleTimeoutHandler.
>     Currently, the implementation works by scheduling a PING frame transmission when there is no read/write activity, then checking whether an ACK is received within the allowed time. (It also supports retries up to a certain threshold.)
>
>     I understand that you’re suggesting leveraging IdleTimeoutHandler to detect when both read and write operations are idle and incorporating it into this logic. Is that correct?

Yes
Currently we use it only for checking that there is no read, because we add this handler between the requests and we are sure there is no write operation. However this handler extends io.netty.handler.timeout.IdleStateHandler and for HTTP/2 we can configure it to check for both read and write.

>     Specifically, do you mean modifying the handler so that when both read and write operations are idle, a PING frame is sent, and if no ACK is received within the retry threshold, the connection is closed?

Yes
When we receive channelIdle we can add the logic for ACK

>     If I’ve misunderstood, could you clarify what exactly you mean by "reuse"? 😮
>
>   2. Can I understand that you are proposing to extend this feature not only on the client side, but also on the server side by IdleTimeoutHandler?

Yes

>     • I thought that this Issue was intended to introduce PING frame support only for the client-side at first.
>
> Looking forward to your feedback! 😊

We may need to move reactor.netty.http.server.IdleTimeoutHandler to the reactor.netty.http package so that it can be used by both server and client.

I think that this feature is interesting for both client and server. Wdyt?

@raccoonback
Contributor Author

@violetagg

Thank you for the detailed explanation!
I will update reactor.netty.http.server.IdleTimeoutHandler to check the Ping ACK internally when using HTTP/2 and H2C.

@violetagg violetagg modified the milestones: 1.2.4, 1.2.5 Mar 6, 2025
@raccoonback
Contributor Author

@violetagg
Hello!
I will address your PR review comments soon.
Thanks!

- Added support for HTTP/2 PING-based health checks in IdleTimeoutHandler
- Ensures connections remain active during health checks

Signed-off-by: raccoonback <[email protected]>
Signed-off-by: raccoonback <[email protected]>
@raccoonback raccoonback marked this pull request as draft March 15, 2025 02:40
@raccoonback raccoonback marked this pull request as ready for review March 16, 2025 14:06
@raccoonback
Contributor Author

raccoonback commented Mar 17, 2025

@violetagg
Hello!
Updated to work based on IdleTimeoutHandler.

Signed-off-by: raccoonback <[email protected]>
Comment on lines +117 to +132
if (isPingIntervalConfigured()) {
    if (pingScheduler == null) {
        isPingAckPending = false;
        pingAckDropCount = 0;
        pingScheduler = ctx.executor()
                .schedule(
                        new PingTimeoutTask(ctx),
                        pingAckTimeoutNanos,
                        NANOSECONDS
                );
    }

    return;
}

ctx.close();
Contributor Author

The scheduler for checking the connection status based on HTTP/2 Ping frames operates only when pingAckTimeout, pingScheduleInterval, and pingAckDropThreshold are all configured.

If they are not configured, the channel is closed immediately.

Comment on lines 141 to 148
public void receive(Object msg) {
    if (msg instanceof Http2PingFrame) {
        Http2PingFrame frame = (Http2PingFrame) msg;
        if (frame.ack() && frame.content() == lastSentPingData) {
            lastReceivedPingTime = System.nanoTime();
        }
    }
}
Contributor Author

When a Ping ACK frame is received, the last received time is updated.

Comment on lines +153 to +159
@Override
public void cancel() {
    if (pingScheduler != null) {
        pingScheduler.cancel(false);
        pingScheduler = null;
    }
}
Contributor Author

The scheduler is canceled when the channel becomes inactive or an exception occurs.
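The schedule-once and cancel-and-clear pattern in the snippet above can be tried outside Netty with the JDK scheduler. This is only a sketch: `PingSchedule` and its methods are hypothetical, and a plain `ScheduledExecutorService` stands in for `ctx.executor()`. Like the original, it assumes all calls happen on a single thread (the event loop), so the field is not synchronized.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the handler's scheduler bookkeeping; single-threaded use assumed. */
final class PingSchedule {
    private final ScheduledExecutorService executor;
    private ScheduledFuture<?> pending;

    PingSchedule(ScheduledExecutorService executor) {
        this.executor = executor;
    }

    /** Schedules the task only if nothing is pending, mirroring the null check in the PR snippet. */
    void start(Runnable task, long delayNanos) {
        if (pending == null) {
            pending = executor.schedule(task, delayNanos, TimeUnit.NANOSECONDS);
        }
    }

    /** Cancels without interrupting a running task and clears the reference; safe to call twice. */
    void cancel() {
        if (pending != null) {
            pending.cancel(false);
            pending = null;
        }
    }

    boolean isScheduled() {
        return pending != null;
    }

    public static void main(String[] args) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try {
            PingSchedule schedule = new PingSchedule(executor);
            schedule.start(() -> System.out.println("ping timeout check"), TimeUnit.SECONDS.toNanos(60));
            schedule.cancel(); // the task is cancelled before its delay elapses
            System.out.println(schedule.isScheduled()); // false
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Clearing the reference after `cancel(false)` is what lets a later idle event schedule a fresh task.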

Comment on lines +194 to +213
if (isOutOfTimeRange()) {
    countPingDrop();

    if (isExceedAckDropThreshold()) {
        if (log.isInfoEnabled()) {
            log.info("Closing the channel due to delayed ping frame response (timeout: {} ns). {}", pingAckTimeoutNanos, channel);
        }

        close();
        return;
    }

    if (log.isInfoEnabled()) {
        log.info("Dropping ping ACK frame in channel (ping data: {}). channel: {}", lastSentPingData, channel);
    }

    writePing(ctx);
    pingScheduler = invokeNextSchedule();
    return;
}
Contributor Author

If the Ping ACK is not received within pingAckTimeoutNanos, but the retry count has not yet reached the pingAckDropThreshold, a retry is attempted.
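The retry decision can be modelled in isolation as follows. This is a sketch under stated assumptions, not the PR's code: `PingAckPolicy` and `Action` are hypothetical names, and "exceeds the threshold" is assumed to mean that the drop count is strictly greater than `pingAckDropThreshold`, which matches the stated default of 0 meaning a single PING attempt with no retries.

```java
/** Hypothetical model of the timeout task's decision; names and exact semantics are assumptions. */
final class PingAckPolicy {
    enum Action { CLOSE, RETRY_PING, SCHEDULE_NEXT }

    private final long ackTimeoutNanos;
    private final int dropThreshold;
    private int dropCount;

    PingAckPolicy(long ackTimeoutNanos, int dropThreshold) {
        this.ackTimeoutNanos = ackTimeoutNanos;
        this.dropThreshold = dropThreshold;
    }

    /** Decides what to do when the timeout task fires for the most recently sent PING. */
    Action onTimeoutCheck(long lastPingSentNanos, long lastAckReceivedNanos) {
        long elapsed = lastAckReceivedNanos - lastPingSentNanos;
        boolean ackedInTime = elapsed >= 0 && elapsed <= ackTimeoutNanos;
        if (ackedInTime) {
            dropCount = 0; // a timely ACK resets the retry budget
            return Action.SCHEDULE_NEXT;
        }
        dropCount++;
        // Assumed rule: close once the number of missed ACKs is greater than the threshold.
        return dropCount > dropThreshold ? Action.CLOSE : Action.RETRY_PING;
    }

    public static void main(String[] args) {
        PingAckPolicy policy = new PingAckPolicy(600_000_000L, 2);
        System.out.println(policy.onTimeoutCheck(1_000L, 0L)); // RETRY_PING
        System.out.println(policy.onTimeoutCheck(2_000L, 0L)); // RETRY_PING
        System.out.println(policy.onTimeoutCheck(3_000L, 0L)); // CLOSE
    }
}
```

With a threshold of 2, the first two missed ACKs trigger a retry and the third closes the connection; with a threshold of 0, the first missed ACK closes it.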

Comment on lines +215 to +217
isPingAckPending = false;
pingAckDropCount = 0;
pingScheduler = invokeNextSchedule();
Contributor Author

If the Ping ACK is not received within pingAckTimeoutNanos and retries have reached the pingAckDropThreshold,
the connection is considered invalid and the channel is closed.

Comment on lines +808 to +812
IdleTimeoutHandler.addIdleTimeoutServerHandler(
        p,
        idleTimeout,
        new HttpConnectionImmediateClose()
);
Contributor Author

When upgrading from HTTP/1.1 to H2C, the IdleTimeoutHandler in H2CleartextCodec is also modified.
Therefore, when starting with the HTTP/1.1 protocol, the IdleTimeoutHandler is initially set to HttpConnectionImmediateClose.

Comment on lines +889 to +893
IdleTimeoutHandler.addIdleTimeoutServerHandler(
        p,
        idleTimeout,
        new HttpConnectionImmediateClose()
);
Contributor Author

For HTTP/1.1, an IdleTimeoutHandler is added as HttpConnectionImmediateClose.

Comment on lines +1037 to +1049
if (idleTimeout != null) {
    IdleTimeoutHandler.removeIdleTimeoutHandler(pipeline);
    IdleTimeoutHandler.addIdleTimeoutServerHandler(
            pipeline,
            idleTimeout,
            new Http2ConnectionLiveness(
                    upgrader.http2FrameCodec,
                    http2SettingsSpec != null ? http2SettingsSpec.pingAckTimeout() : null,
                    http2SettingsSpec != null ? http2SettingsSpec.pingScheduleInterval() : null,
                    http2SettingsSpec != null ? http2SettingsSpec.pingAckDropThreshold() : null
            )
    );
}
Contributor Author

In H2CleartextCodec, when upgrading from HTTP/1.1 to the H2C protocol, the IdleTimeoutHandler is changed to be based on Http2ConnectionLiveness.

// When the server is configured with HTTP/1.1 and H2 and HTTP/1.1 is negotiated,
// when channelActive event happens, this HttpTrafficHandler is still not in the pipeline,
// and will not be able to add IdleTimeoutHandler. So in this use case add IdleTimeoutHandler here.
IdleTimeoutHandler.addIdleTimeoutHandler(ctx.pipeline(), idleTimeout);
Contributor Author

Moved so that this is performed in configureHttp11OrH2CleartextPipeline().

@@ -546,7 +539,6 @@ void handleLastHttpContent(Object msg, ChannelPromise promise) {
ctx.executor().execute(this);
}
else {
IdleTimeoutHandler.addIdleTimeoutHandler(ctx.pipeline(), idleTimeout);
Contributor Author

Modified to add the IdleTimeoutHandler for each protocol in HttpServerConfig instead of handling it in HttpTrafficHandler.

Labels
type/enhancement A general enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP/2 PING frame handling
2 participants