[BUG] Gateway mode returns generic CosmosException instead of specific exception types (NotFoundException, BadRequestException, etc.) #47823

@Ritvik-Jayaswal

Describe the bug

In Gateway mode, the Java SDK always throws a generic CosmosException, regardless of the HTTP status code. Direct mode, by contrast, throws specific exception subclasses such as NotFoundException, BadRequestException, and ForbiddenException, chosen from the status code.

This causes inconsistent behavior between connection modes and breaks code that relies on catching specific exception types:

try {
    container.readItem(id, new PartitionKey(pk), MyClass.class);
} catch (NotFoundException e) {
    // This works in Direct mode but NOT in Gateway mode
    return Optional.empty();
}

Root Cause Analysis

The issue is in the RxGatewayStoreModel.validateOrThrow() method (RxGatewayStoreModel.java#L663-L691):

private void validateOrThrow(RxDocumentServiceRequest request,
                             HttpResponseStatus status,
                             HttpHeaders headers,
                             ByteBuf retainedBodyAsByteBuf) {

    int statusCode = status.code();

    if (statusCode >= HttpConstants.StatusCodes.MINIMUM_STATUSCODE_AS_ERROR_GATEWAY) {
        // ... error parsing ...
        
        // Always creates generic CosmosException regardless of status code
        CosmosException dce = BridgeInternal.createCosmosException(
            request.requestContext.resourcePhysicalAddress, 
            statusCode, 
            cosmosError, 
            headers.toLowerCaseMap());
        BridgeInternal.setRequestHeaders(dce, request.getHeaders());
        throw dce;
    }
}

In contrast, Direct mode (both TCP/RNTBD and HTTP) uses a switch statement to create appropriate exception types:

Example from RntbdRequestManager:

switch (status.code()) {
    case StatusCodes.BADREQUEST:
        cause = new BadRequestException(error, lsn, partitionKeyRangeId, responseHeaders);
        break;
    case StatusCodes.NOTFOUND:
        cause = new NotFoundException(error, lsn, partitionKeyRangeId, responseHeaders);
        break;
    // ... other status codes properly mapped
}

Exception or Stack Trace

When reading a non-existent document in Gateway mode:

com.azure.cosmos.CosmosException: {"code":"NotFound","message":"Document does not exist..."}
    statusCode = 404

When reading a non-existent document in Direct mode:

com.azure.cosmos.implementation.NotFoundException: {"Errors":["Resource Not Found..."]}
    statusCode = 404

To Reproduce

  1. Create a CosmosClient with Gateway mode enabled:
CosmosClient client = new CosmosClientBuilder()
    .endpoint(endpoint)
    .key(key)
    .gatewayMode()  // Force gateway mode
    .buildClient();
  2. Attempt to read a non-existent document:
try {
    container.readItem("non-existent-id", new PartitionKey("pk"), MyClass.class);
} catch (NotFoundException e) {
    // This catch block is NEVER entered in Gateway mode
    System.out.println("Caught NotFoundException");
} catch (CosmosException e) {
    // This is always caught instead
    System.out.println("Caught generic CosmosException: " + e.getClass().getSimpleName());
    System.out.println("Status code: " + e.getStatusCode());
}

Code Snippet

Minimal reproduction:

import com.azure.cosmos.*;
import com.azure.cosmos.implementation.NotFoundException;
import com.azure.cosmos.models.PartitionKey;

public class GatewayExceptionBugDemo {
    public static void main(String[] args) {
        CosmosClient client = new CosmosClientBuilder()
            .endpoint("<your-endpoint>")
            .key("<your-key>")
            .gatewayMode()
            .buildClient();
        
        CosmosContainer container = client
            .getDatabase("testdb")
            .getContainer("testcontainer");
        
        try {
            container.readItem("non-existent-id", new PartitionKey("pk"), Object.class);
        } catch (NotFoundException e) {
            System.out.println("SUCCESS: Caught NotFoundException");
        } catch (CosmosException e) {
            System.out.println("BUG: Caught " + e.getClass().getSimpleName() + 
                " instead of NotFoundException. Status: " + e.getStatusCode());
        }
        
        client.close();
    }
}

Expected behavior

Gateway mode should return the same specific exception types as Direct mode based on HTTP status codes:

Status Code    Expected Exception Type
-----------    -----------------------
400            BadRequestException
401            UnauthorizedException
403            ForbiddenException
404            NotFoundException
405            MethodNotAllowedException
409            ConflictException
410            GoneException (or subtypes based on sub-status)
412            PreconditionFailedException
413            RequestEntityTooLargeException
423            LockedException
429            RequestRateTooLargeException
449            RetryWithException
500            InternalServerErrorException
503            ServiceUnavailableException
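
The mapping in the table above can be illustrated with a small, self-contained sketch. It does not use the SDK: every class below (SketchCosmosException and its subclasses, StatusCodeMappingSketch, fromStatus) is a hypothetical stand-in invented for this example, and only three of the status codes are covered. It is meant to show the shape of the expected behavior, with unmapped codes falling back to the generic type, not the SDK's actual implementation.

```java
// Hypothetical stand-ins for the SDK exception hierarchy. These are NOT the
// real com.azure.cosmos classes; they carry just enough structure to show
// how a status code could select an exception subclass.
final class StatusCodeMappingSketch {

    static class SketchCosmosException extends RuntimeException {
        private final int statusCode;

        SketchCosmosException(int statusCode, String message) {
            super(message);
            this.statusCode = statusCode;
        }

        int getStatusCode() {
            return statusCode;
        }
    }

    static class SketchBadRequestException extends SketchCosmosException {
        SketchBadRequestException(String message) { super(400, message); }
    }

    static class SketchNotFoundException extends SketchCosmosException {
        SketchNotFoundException(String message) { super(404, message); }
    }

    static class SketchConflictException extends SketchCosmosException {
        SketchConflictException(String message) { super(409, message); }
    }

    // Picks a specific subclass for known status codes and falls back to the
    // generic type for everything else, so callers can still catch the base type.
    static SketchCosmosException fromStatus(int statusCode, String message) {
        switch (statusCode) {
            case 400: return new SketchBadRequestException(message);
            case 404: return new SketchNotFoundException(message);
            case 409: return new SketchConflictException(message);
            default:  return new SketchCosmosException(statusCode, message);
        }
    }

    public static void main(String[] args) {
        // A mapped code yields the specific subclass.
        System.out.println(fromStatus(404, "missing").getClass().getSimpleName());
        // An unmapped code yields the generic base type with the code preserved.
        System.out.println(fromStatus(418, "unmapped").getClass().getSimpleName());
    }
}
```

Because each subclass extends the generic type, code that already catches the base exception keeps working after such a change; only code that catches a subclass gains new behavior.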

Proposed Fix

Update RxGatewayStoreModel.validateOrThrow() to use a switch statement similar to RntbdRequestManager and HttpTransportClient:

private void validateOrThrow(RxDocumentServiceRequest request,
                             HttpResponseStatus status,
                             HttpHeaders headers,
                             ByteBuf retainedBodyAsByteBuf) {

    int statusCode = status.code();

    if (statusCode >= HttpConstants.StatusCodes.MINIMUM_STATUSCODE_AS_ERROR_GATEWAY) {
        // ... existing error parsing code ...
        
        CosmosException exception;
        Map<String, String> headersMap = headers.toLowerCaseMap();
        
        switch (statusCode) {
            case HttpConstants.StatusCodes.BADREQUEST:
                exception = new BadRequestException(cosmosError.getMessage(), null, headersMap, null);
                break;
            case HttpConstants.StatusCodes.UNAUTHORIZED:
                exception = new UnauthorizedException(cosmosError.getMessage(), null, headersMap, null);
                break;
            case HttpConstants.StatusCodes.FORBIDDEN:
                exception = new ForbiddenException(cosmosError.getMessage(), null, headersMap, null);
                break;
            case HttpConstants.StatusCodes.NOTFOUND:
                exception = new NotFoundException(cosmosError.getMessage(), null, headersMap, null);
                break;
            // ... other status codes ...
            default:
                exception = BridgeInternal.createCosmosException(
                    request.requestContext.resourcePhysicalAddress, 
                    statusCode, 
                    cosmosError, 
                    headersMap);
                break;
        }
        
        BridgeInternal.setRequestHeaders(exception, request.getHeaders());
        throw exception;
    }
}

Setup:

  • OS: Windows
  • IDE: IntelliJ IDEA (JetBrains)
  • Library/Libraries: com.azure:azure-cosmos:4.x.x (all versions affected)
  • Java version: 8+
  • App Server/Environment: N/A - Core SDK issue (reproducible in any environment)
  • Frameworks: N/A - Core SDK issue (not framework-specific)

Additional context

This issue was originally surfaced by users of the vNext Cosmos DB Emulator (see azure-cosmos-db-emulator-docker#262), which only supports Gateway mode. However, this is actually an SDK bug that affects all Gateway mode usage, not just the emulator.

The bug has been "hidden" for most users because:

  1. Direct mode is the default connection mode
  2. Most production deployments use Direct mode for performance reasons
  3. The vNext emulator only supporting Gateway mode exposed this inconsistency

Impact

  • Code that catches specific exception types (e.g., NotFoundException) will not work correctly in Gateway mode
  • Users cannot write connection-mode-agnostic exception handling code
  • Breaks behavioral compatibility between Direct and Gateway modes
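
Until the exception types are consistent, one possible workaround is to catch the base exception and branch on the status code, which both modes report as 404 in the traces shown earlier. The sketch below is self-contained and does not touch the SDK: SketchCosmosException, SketchNotFoundException, ModeAgnosticReadSketch, and readOrEmpty are hypothetical names introduced for illustration. In real code the same pattern would presumably catch CosmosException and call its getStatusCode() method, as the reproduction above does.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class ModeAgnosticReadSketch {

    // Hypothetical stand-in for the generic exception that Gateway mode throws.
    static class SketchCosmosException extends RuntimeException {
        private final int statusCode;

        SketchCosmosException(int statusCode, String message) {
            super(message);
            this.statusCode = statusCode;
        }

        int getStatusCode() {
            return statusCode;
        }
    }

    // Hypothetical stand-in for the specific subclass that Direct mode throws.
    static class SketchNotFoundException extends SketchCosmosException {
        SketchNotFoundException(String message) { super(404, message); }
    }

    // Runs the read and treats any 404 as "no such item", whether it arrives
    // as the generic type or as the subclass. Other failures are rethrown.
    static <T> Optional<T> readOrEmpty(Supplier<T> read) {
        try {
            return Optional.ofNullable(read.get());
        } catch (SketchCosmosException e) {
            if (e.getStatusCode() == 404) {
                return Optional.empty();
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        // Simulated Gateway-mode failure: generic exception carrying 404.
        Optional<String> gateway = readOrEmpty(() -> {
            throw new SketchCosmosException(404, "NotFound");
        });
        // Simulated Direct-mode failure: specific subclass.
        Optional<String> direct = readOrEmpty(() -> {
            throw new SketchNotFoundException("Resource Not Found");
        });
        // Simulated successful read.
        Optional<String> found = readOrEmpty(() -> "item");

        System.out.println(gateway.isPresent() + " " + direct.isPresent() + " " + found.isPresent());
    }
}
```

Checking the status code avoids depending on which subclass a given connection mode happens to throw, at the cost of losing the more readable typed catch blocks.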

Information Checklist

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added

Metadata

Labels

  • Client: This issue points to a problem in the data-plane of the library.
  • Cosmos
  • Service Attention: Workflow: This issue is responsible by Azure service team.
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that