RetryCapacityExceededException not being thrown as expected when circuitBreakerMode is tripped in the StandardRetryStrategy #1321

Open
lauzadis opened this issue Jun 4, 2024 · 0 comments
Labels: bug (This issue is a bug.), p2 (This is a standard priority issue), queued (This issue is on the AWS team's backlog)

lauzadis commented Jun 4, 2024

Describe the bug

When using the StandardRetryStrategy in the SDK, circuitBreakerMode is enabled by default. Under a highly concurrent workload that gets throttled by an AWS service, the token bucket's capacity is quickly consumed, which trips circuit breaker mode and causes an exception to be thrown.

The exception should be RetryCapacityExceededException("Insufficient capacity to attempt another retry"), but it is instead the retryable exception returned by the AWS service, which obscures the actual cause of the failure.

Expected behavior

Ensure that the correct exception is thrown with a useful message that recommends disabling circuitBreakerMode if needed.
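
To make the expected behavior concrete, here is a minimal toy model in plain Kotlin. It is not the SDK's implementation: ToyTokenBucket, ThrottledException, CapacityExceededException, and retrying are hypothetical stand-ins, the bucket never refills, and the costs are arbitrary. It only illustrates that once the bucket cannot pay for another retry, the caller should see a capacity exception (carrying the service error as its cause) rather than the raw throttling error.

```kotlin
// Toy model only: every name here is a hypothetical stand-in, not an SDK type.
class ThrottledException(message: String) : Exception(message)

class CapacityExceededException(message: String, cause: Throwable?) : Exception(message, cause)

// Fixed-capacity bucket with no refill. In "circuit breaker" style, an empty
// bucket fails fast instead of waiting for capacity to come back.
class ToyTokenBucket(private var capacity: Int, private val retryCost: Int) {
    fun acquireForRetry(lastFailure: Throwable) {
        if (capacity < retryCost) {
            throw CapacityExceededException(
                "Insufficient capacity to attempt another retry; " +
                    "consider disabling circuit breaker mode if this is expected",
                lastFailure,
            )
        }
        capacity -= retryCost
    }
}

fun <T> retrying(bucket: ToyTokenBucket, maxAttempts: Int, block: (attempt: Int) -> T): T {
    var attempt = 1
    while (true) {
        try {
            return block(attempt)
        } catch (e: ThrottledException) {
            // Attempts exhausted: surfacing the service error is appropriate here.
            if (attempt >= maxAttempts) throw e
            // Bucket exhausted: this throws CapacityExceededException instead.
            bucket.acquireForRetry(e)
            attempt++
        }
    }
}

fun main() {
    val bucket = ToyTokenBucket(capacity = 10, retryCost = 5)
    try {
        // Every attempt is throttled, so the bucket runs dry long before maxAttempts.
        retrying<Unit>(bucket, maxAttempts = 20) { throw ThrottledException("Rate exceeded") }
    } catch (e: CapacityExceededException) {
        println("${e.message} (cause: ${e.cause?.message})")
    }
}
```

In this sketch the throttling error is only thrown directly when maxAttempts is reached; a failure caused by an empty bucket is always reported as the capacity exception, which is the distinction this issue asks for.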

Current behavior

A retryable service exception is thrown, which is confusing because it looks as though it should have been retried.

Steps to Reproduce

Launch 300 parallel calls to lambda::ListFunctions. Most of them will get throttled, and circuit breaker mode will trip, but the exception thrown is not the expected one.

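import aws.sdk.kotlin.services.lambda.LambdaClient
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

fun main(): Unit = runBlocking {
    // Default StandardRetryStrategy with only the attempt limit raised
    val client = LambdaClient.fromEnvironment {
        retryStrategy {
            maxAttempts = 20
        }
    }

    // Fire 300 concurrent calls through the same client; most get throttled
    for (i in 0 until 300) {
        launch { client.listFunctions() }
    }
}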

Exception with stack trace:

Exception in thread "main" TooManyRequestsException(message=Rate exceeded,reason=CallerRateLimitExceeded,retryAfterSeconds=null,type=User)
	at aws.sdk.kotlin.services.lambda.model.TooManyRequestsException$Builder.build(TooManyRequestsException.kt:82)
	at aws.sdk.kotlin.services.lambda.serde.TooManyRequestsExceptionDeserializer.deserialize(TooManyRequestsExceptionDeserializer.kt:38)
	at aws.sdk.kotlin.services.lambda.serde.ListFunctionsOperationDeserializerKt.throwListFunctionsError(ListFunctionsOperationDeserializer.kt:61)
	at aws.sdk.kotlin.services.lambda.serde.ListFunctionsOperationDeserializerKt.access$throwListFunctionsError(ListFunctionsOperationDeserializer.kt:1)
	at aws.sdk.kotlin.services.lambda.serde.ListFunctionsOperationDeserializer.deserialize(ListFunctionsOperationDeserializer.kt:36)
	at aws.sdk.kotlin.services.lambda.serde.ListFunctionsOperationDeserializer.deserialize(ListFunctionsOperationDeserializer.kt:31)
	at aws.smithy.kotlin.runtime.http.operation.DeserializeHandler.call(SdkOperationExecution.kt:338)
	at aws.smithy.kotlin.runtime.http.operation.DeserializeHandler$call$1.invokeSuspend(SdkOperationExecution.kt)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:277)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:95)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:69)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:48)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at sdktest.LambdaThrottlesKt.main(LambdaThrottles.kt:15)
	at sdktest.LambdaThrottlesKt.main(LambdaThrottles.kt)

Possible Solution

Make sure the RetryCapacityExceededException is not somehow being suppressed.

Context

Related to #1318. If the correct exception had been thrown, the user could have determined the cause more quickly.

AWS SDK for Kotlin version

1.2.23

Platform (JVM/JS/Native)

JVM

Operating system and version

macOS Sonoma

lauzadis added the bug and needs-triage labels, then removed needs-triage, on Jun 4, 2024
ianbotsf self-assigned this on Jun 5, 2024
RanVaknin added the p2 and queued labels on Nov 27, 2024