@wvuong wvuong commented Jan 22, 2025

Motivation:

Add `GrpcHealthCheckedEndpointGroupBuilder`, which builds a health-checked endpoint group whose endpoint health is determined by the result of the [standard gRPC health check service](https://grpc.io/docs/guides/health-checking/).

Modifications:

* Adds `GrpcHealthCheckedEndpointGroupBuilder`, which extends `AbstractHealthCheckedEndpointGroupBuilder` and creates a new health check function.
* Adds `GrpcHealthChecker`, the health check function, which creates a gRPC `HealthGrpc` stub and uses it to call the gRPC health service on the endpoint. The endpoint is considered healthy if the response status is `SERVING`, and unhealthy if the status is anything else or the request fails.
* Adds tests.

Result:

* A user can create a health checked endpoint group that is backed by a gRPC health check service.
* Closes line#5930
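
The healthy/unhealthy rule described in the modifications can be illustrated in isolation. The following is a minimal, dependency-free sketch, not the code from this PR: the `ServingStatus` enum here is a local stand-in that mirrors the values of gRPC's `HealthCheckResponse.ServingStatus`, and `GrpcHealthRule` and `isHealthy` are hypothetical names used only for illustration.

```java
/** Local stand-in mirroring gRPC's HealthCheckResponse.ServingStatus (assumption: no gRPC dependency). */
enum ServingStatus { UNKNOWN, SERVING, NOT_SERVING, SERVICE_UNKNOWN }

/** Hypothetical helper that encodes the rule from the PR description. */
final class GrpcHealthRule {
    private GrpcHealthRule() {}

    /**
     * Returns true only when the request succeeded (error == null) and the
     * reported status is SERVING. Any other status, or a request failure,
     * is treated as unhealthy.
     */
    static boolean isHealthy(ServingStatus status, Throwable error) {
        if (error != null) {
            return false;
        }
        return status == ServingStatus.SERVING;
    }

    public static void main(String[] args) {
        System.out.println(isHealthy(ServingStatus.SERVING, null));
        System.out.println(isHealthy(ServingStatus.NOT_SERVING, null));
        System.out.println(isHealthy(null, new RuntimeException("connection refused")));
    }
}
```

Keeping the decision in one small function makes it easy to test every status value without starting a server.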

CLAassistant commented Jan 22, 2025

CLA assistant check
All committers have signed the CLA.

if (healthCheckResponse.getStatus() == HealthCheckResponse.ServingStatus.SERVING) {
    ctx.updateHealth(HEALTHY, reqCtx, null, null);
} else {
    // not sure about the response headers but it needs to be non-null
Contributor

I think it's fine to leave the headers as null for now.

Alternatively, we can also extract the headers from the ctx log

ResponseHeaders responseHeaders = null;
if (reqCtx.log().isAvailable(RequestLogProperty.RESPONSE_HEADERS)) {
    responseHeaders = reqCtx.log().partial().responseHeaders();
}

lock();
try {
    final HealthCheckRequest.Builder builder = HealthCheckRequest.newBuilder();
    if (this.service != null) {
Contributor

Suggested change
- if (this.service != null) {
+ if (service != null) {

Comment on lines 91 to 99
public void onError(Throwable throwable) {
    final ClientRequestContext reqCtx = reqCtxCaptor.get();
    // same here
    ctx.updateHealth(UNHEALTHY, reqCtx, UNHEALTHY_RESPONSE_HEADERS, throwable);
}

@Override
public void onCompleted() {
}
Contributor

Question) If the connection is dropped, is there no need to retry the check (with a backoff)?
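
To make the retry question concrete, here is a minimal sketch of one common policy, capped exponential backoff, in plain Java. It is an illustration under stated assumptions and not what this PR implements: `BackoffSketch`, `delayMillis`, and the 100 ms / 5000 ms values in `main` are hypothetical.

```java
/** Hypothetical helper: computes the wait before each retry of a failed health check. */
final class BackoffSketch {
    private BackoffSketch() {}

    /**
     * Delay before the given retry attempt (1-based). The delay starts at
     * initialMillis, doubles on every further attempt, and never exceeds maxMillis.
     */
    static long delayMillis(int attempt, long initialMillis, long maxMillis) {
        if (attempt < 1 || initialMillis <= 0 || maxMillis < initialMillis) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = initialMillis;
        for (int i = 1; i < attempt && delay < maxMillis; i++) {
            // Clamp before doubling so the value can neither overflow nor pass the cap.
            delay = delay > maxMillis / 2 ? maxMillis : delay * 2;
        }
        return delay;
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= 8; attempt++) {
            System.out.println("attempt " + attempt + ": wait "
                    + delayMillis(attempt, 100, 5000) + " ms");
        }
    }
}
```

A health checker would wait for the computed delay after an `onError` callback before issuing the next check, and reset the attempt counter after a successful response.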

}

public void start() {
    check();
Contributor

Question) Is there no need to also implement watch for completeness? Also, what do you think of allowing users to configure which method to use?

Author

I think that's a good idea.

Contributor

https://grpc.io/docs/guides/health-checking/

The health check service on a gRPC server supports two modes of operation:

  • Unary calls to the Check rpc endpoint
    • Useful for centralized monitoring or load balancing solutions, but does not scale to support a fleet of gRPC client constantly making health checks
  • Streaming health updates by using the Watch rpc endpoint
    • Used by the client side health check feature in gRPC clients

Watch seems to be the preferred mode for standard gRPC clients.

@minwoox minwoox added this to the 1.33.0 milestone Feb 12, 2025
Contributor

@ikhoon ikhoon left a comment

I would be satisfied as long as check runs periodically and watch is implemented.

}

try (ClientRequestContextCaptor reqCtxCaptor = Clients.newContextCaptor()) {
    stub.check(builder.build(), new StreamObserver<HealthCheckResponse>() {
Contributor

Check is a unary call, so it can't continuously update the upstream's status.
For the Check method, we need a scheduler that sends Check requests periodically.
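
The scheduling idea can be sketched with only the JDK. This is a rough illustration, not the PR's implementation: `PeriodicCheckSketch` is a hypothetical class, the `BooleanSupplier` stands in for a single unary Check call (true meaning `SERVING`), and the 50 ms interval is arbitrary.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: re-runs a unary health check at a fixed delay after each completion. */
final class PeriodicCheckSketch implements AutoCloseable {
    private final ScheduledExecutorService executor;
    private final BooleanSupplier unaryCheck; // stand-in for one Check RPC
    private final long intervalMillis;
    private final AtomicBoolean healthy = new AtomicBoolean(false);
    private volatile boolean closed;

    PeriodicCheckSketch(ScheduledExecutorService executor, BooleanSupplier unaryCheck,
                        long intervalMillis) {
        this.executor = executor;
        this.unaryCheck = unaryCheck;
        this.intervalMillis = intervalMillis;
    }

    /** Runs the first check right away; every check schedules the next one. */
    void start() {
        executor.execute(this::checkOnce);
    }

    private void checkOnce() {
        if (closed) {
            return; // do not issue checks after close()
        }
        boolean result;
        try {
            result = unaryCheck.getAsBoolean();
        } catch (RuntimeException e) {
            result = false; // a failed request counts as unhealthy
        }
        healthy.set(result);
        if (closed) {
            return;
        }
        try {
            executor.schedule(this::checkOnce, intervalMillis, TimeUnit.MILLISECONDS);
        } catch (RejectedExecutionException e) {
            // The executor was shut down between checks; stop quietly.
        }
    }

    boolean isHealthy() {
        return healthy.get();
    }

    @Override
    public void close() {
        closed = true;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try (PeriodicCheckSketch checker = new PeriodicCheckSketch(executor, () -> true, 50)) {
            checker.start();
            Thread.sleep(200);
            System.out.println("healthy = " + checker.isHealthy());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Scheduling the next check only after the previous one completes avoids overlapping requests to a slow endpoint. The `closed` flag and the ignored `RejectedExecutionException` are one way to keep a late check from firing, or logging an error, while everything is shutting down.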


Author

wvuong commented Feb 19, 2025

Hi, I have started working through these changes:

* Add GrpcHealthCheckWatcher, which uses the Watch method
* Refactor shared logic into a common class, AbstractGrpcHealthChecker
* Add support for configuring GrpcHealthCheckedEndpointGroupBuilder to select the health check method with GrpcHealthCheckMethod
* Add and update tests, redo HealthGrpcServerExtension

Author

wvuong commented Feb 25, 2025

Hi, I've implemented the requested changes:

  • Added the ability to configure GrpcHealthCheckedEndpointGroupBuilder to health check with either the Check or the Watch method
  • Added GrpcHealthCheckWatcher, which performs the health check with the Watch method
  • Updated both methods to use the executor to schedule the next check as necessary
  • Other minor cleanup

Unit tests and checkstyle pass. In the tests, an exception is logged during teardown when a health check tries to fire while teardown is in progress. I'm not sure what to do about that.

Also, build-windows-latest-jdk-21 fails, but all others pass. 🤷‍♂️

@github-actions github-actions bot added the Stale label Mar 27, 2025
@github-actions github-actions bot removed the Stale label Apr 29, 2025
@github-actions github-actions bot added the Stale label May 29, 2025
@ikhoon ikhoon modified the milestones: 1.33.0, 1.34.0 Aug 1, 2025
@github-actions github-actions bot removed the Stale label Aug 2, 2025
@github-actions github-actions bot added the Stale label Sep 2, 2025
@jrhee17 jrhee17 modified the milestones: 1.34.0, 1.35.0 Nov 24, 2025
@github-actions github-actions bot removed the Stale label Nov 25, 2025
@github-actions github-actions bot added the Stale label Dec 28, 2025
@minwoox minwoox modified the milestones: 1.35.0, 1.36.0 Dec 30, 2025
@github-actions github-actions bot removed the Stale label Jan 2, 2026

Development

Successfully merging this pull request may close these issues.

Support for gRPC health check endpoint
