noncebalancer: use endpointsharding, ignore ready status #8679
Draft
The old noncebalancer only saw READY SubConns, which was a problem during the brief periods when a SubConn was reconnecting (for instance due to a GOAWAY from the server), since nonce redemption requests are not fungible between backends. Unfortunately, READY SubConns are all that the balancer interface provides. And we can't get that interface to pass non-READY SubConns to our picker without reimplementing or copying all its SubConn management logic.

Luckily, grpc provides the [`endpointsharding`](https://pkg.go.dev/google.golang.org/grpc/balancer/endpointsharding) balancer implementation that does exactly what we want. It maintains a collection of child balancers, each owning a single endpoint (note: for our purposes an endpoint is equivalent to an address, though in general one endpoint can have many addresses). It also lets us query the [state](https://pkg.go.dev/google.golang.org/grpc/balancer/endpointsharding#ChildState) of each child, including the endpoint it's responsible for.

This allows us to construct a picker that is aware of all available backends, even those that aren't currently READY. That, in turn, prevents us from temporarily serving errors while a given nonce redemption backend reconnects.
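
To see another example of `endpointsharding` in use, see the [`customroundrobin`](https://github.com/grpc/grpc-go/blob/99f36d4a0c28bc967a8d3fe23ebc2a264b322070/examples/features/customloadbalancer/client/customroundrobin/customroundrobin.go) implementation.

For more context on how `endpointsharding` came to be implemented, see [gRFC A61: IPv4 and IPv6 Dualstack Backend Support](https://github.com/grpc/proposal/blob/master/A61-IPv4-IPv6-dualstack-backends.md).

If you're curious how `endpointsharding` passes around the information about non-READY SubConns, it uses a type assertion from a `balancer.Picker` to its internal type.

Alternative to #8672. Fixes #8662.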

This edits `noncebalancer.go` in place for ease of diffing, but we may want to split it to a new `noncev2` balancer and control it with a feature flag as #8672 does.
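
For readers unfamiliar with the type-assertion pattern mentioned above, the following is a rough, standard-library-only illustration of the general technique. All names (`Picker`, `childInfo`, `shardingPicker`, `childrenFromPicker`, `otherPicker`) are hypothetical and do not mirror the grpc source; the sketch only shows how extra state can ride along behind an interface value and be recovered by the code that knows the concrete type.

```go
package main

import (
	"errors"
	"fmt"
)

// Picker is a hypothetical stand-in for the balancer.Picker interface.
type Picker interface {
	Pick(dest string) (string, error)
}

// childInfo is a hypothetical per-child record carried by the picker.
type childInfo struct {
	Addr  string
	Ready bool
}

// shardingPicker is a hypothetical internal picker type. Besides
// implementing Pick, it carries the state of every child, READY or not.
type shardingPicker struct {
	children []childInfo
}

// Pick returns the first READY child, ignoring dest in this toy model.
func (p *shardingPicker) Pick(string) (string, error) {
	for _, c := range p.children {
		if c.Ready {
			return c.Addr, nil
		}
	}
	return "", errors.New("no READY child")
}

// childrenFromPicker recovers the child states from an opaque Picker by
// asserting it back to the concrete type. It returns nil when the picker
// was created by something else.
func childrenFromPicker(p Picker) []childInfo {
	sp, ok := p.(*shardingPicker)
	if !ok {
		return nil
	}
	return sp.children
}

// otherPicker is an unrelated Picker used to show the failed assertion.
type otherPicker struct{}

func (otherPicker) Pick(string) (string, error) { return "", nil }

func main() {
	var p Picker = &shardingPicker{children: []childInfo{
		{Addr: "10.0.0.1:9101", Ready: true},
		{Addr: "10.0.0.2:9101", Ready: false},
	}}
	fmt.Println(len(childrenFromPicker(p)), "children recovered")
	fmt.Println(childrenFromPicker(otherPicker{}) == nil, "for a foreign picker")
}
```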