@usnavy13 usnavy13 commented Dec 3, 2025

Summary

This PR adds support for running Anthropic Claude models through Google Cloud Vertex AI, addressing the feature request in #5995.

Background:

  • Claude models are available on Google Cloud Vertex AI with near-identical API functionality to the direct Anthropic API
  • The current implementation creates a generic Google client that doesn't support function calling, proper token limits, or streaming
  • This has been a requested feature for roughly nine months, with sustained community interest

Building upon:

Key improvements in this PR:

  • YAML-based configuration (similar to Azure OpenAI) for better flexibility and deployment customization
  • Improved error handling with specific error messages for credential issues
  • Full web search support with proper header handling for Vertex AI
  • Prompt caching support via automatic header filtering for Vertex AI compatibility (a minimal sketch of the header filtering follows this list)

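As a rough illustration of the web search and prompt caching points above, the snippet below sketches how unsupported `anthropic-beta` header values could be stripped before a request is sent to Vertex AI. This is not the PR's actual code: the function name, option shape, and prefix-matching approach are assumptions, with the prefix list taken from the values this PR filters (prompt-caching, max-tokens, output-128k, token-efficient-tools, context-1m).

// Illustrative sketch only, not the PR's implementation.
const UNSUPPORTED_BETA_PREFIXES = [
  'prompt-caching',
  'max-tokens',
  'output-128k',
  'token-efficient-tools',
  'context-1m',
];

function filterBetaHeadersForVertex(headers: Record<string, string>): Record<string, string> {
  const betaHeader = headers['anthropic-beta'];
  if (!betaHeader) {
    return headers;
  }

  // `anthropic-beta` carries a comma-separated list of feature flags;
  // keep only the ones Vertex AI accepts (prefix match is an assumption here).
  const kept = betaHeader
    .split(',')
    .map((value) => value.trim())
    .filter((value) => !UNSUPPORTED_BETA_PREFIXES.some((prefix) => value.startsWith(prefix)));

  const filtered = { ...headers };
  if (kept.length > 0) {
    filtered['anthropic-beta'] = kept.join(',');
  } else {
    delete filtered['anthropic-beta'];
  }
  return filtered;
}
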
Configuration Options

Option 1: YAML Configuration (Recommended)

# librechat.yaml
endpoints:
  anthropic:
    streamRate: 20
    vertex:
      enabled: true
      region: "us-east5"  # Optional, defaults to 'us-east5'
      projectId: "${GOOGLE_PROJECT_ID}"  # Optional, auto-detected from service key
      serviceKeyFile: "/path/to/service-account.json"  # Optional

Option 2: Environment Variables

ANTHROPIC_USE_VERTEX=true
ANTHROPIC_VERTEX_REGION=us-east5
GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json

Supports all Anthropic Claude models available on Vertex AI.
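
Since `projectId` is optional in the YAML option and auto-detected from the service key, here is a minimal sketch of how that resolution could work. The names (`VertexOptions`, `resolveVertexOptions`) are illustrative rather than the PR's actual API; the only firm assumption is that a standard Google service-account JSON key contains a `project_id` field.

import { readFileSync } from 'node:fs';

// Illustrative option shape; the real config schema in the PR may differ.
interface VertexOptions {
  projectId?: string;
  region?: string;
  serviceKeyFile?: string;
}

function resolveVertexOptions(options: VertexOptions): { projectId: string; region: string } {
  // Explicit config wins; otherwise fall back to the env var, then the default region.
  const region = options.region ?? process.env.ANTHROPIC_VERTEX_REGION ?? 'us-east5';

  let projectId = options.projectId;
  if (!projectId && options.serviceKeyFile) {
    // Standard Google service-account keys carry a `project_id` field.
    const key = JSON.parse(readFileSync(options.serviceKeyFile, 'utf8'));
    projectId = key.project_id;
  }
  if (!projectId) {
    throw new Error('Vertex AI project ID could not be determined from config or service key');
  }
  return { projectId, region };
}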

Credits

Closes #5995


Note on Documentation

If this looks like the direction you'd like to go, I'm happy to submit a corresponding documentation PR to librechat.ai; I just didn't want to get ahead of review.


Change Type

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Testing

  1. Configure Vertex AI in librechat.yaml:
    endpoints:
      anthropic:
        vertex:
          enabled: true
          region: "us-east5"
  2. Place Google service account key at api/data/auth.json (or configure path via serviceKeyFile)
  3. Start LibreChat and select an Anthropic model

Tested scenarios:

  • Text conversations with streaming
  • Image input processing
  • Web search functionality
  • Function/tool calling via Agents

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • I have made pertinent documentation changes
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes
  • Any changes dependent on mine have been merged and published in downstream modules.
  • A pull request for updating the documentation has been submitted.

Ziyann and others added 10 commits November 23, 2025 15:02
- Support both OpenAI format (input_token_details) and Anthropic format (cache_*_input_tokens) for token usage tracking
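
As a purely illustrative sketch of the dual-format handling described above (not the PR's code; the field names under `input_token_details` are assumed), the normalization could look like:

interface NormalizedCacheUsage {
  cacheCreationTokens: number;
  cacheReadTokens: number;
}

function normalizeCacheUsage(usage: Record<string, unknown>): NormalizedCacheUsage {
  const details = usage.input_token_details as Record<string, number> | undefined;
  if (details) {
    // OpenAI-style: nested token detail object (inner field names assumed here).
    return {
      cacheCreationTokens: details.cache_creation ?? 0,
      cacheReadTokens: details.cache_read ?? 0,
    };
  }
  // Anthropic-style: flat cache_*_input_tokens counters.
  return {
    cacheCreationTokens: (usage.cache_creation_input_tokens as number) ?? 0,
    cacheReadTokens: (usage.cache_read_input_tokens as number) ?? 0,
  };
}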

- Filter out unsupported anthropic-beta header values for Vertex AI (prompt-caching, max-tokens, output-128k, token-efficient-tools, context-1m)
- Introduced configuration options for running Anthropic models via Google Cloud Vertex AI in the YAML file.
- Updated ModelService to prioritize Vertex AI models from the configuration.
- Enhanced endpoint configuration to enable Anthropic endpoint when Vertex AI is configured.
- Implemented validation and processing for Vertex AI credentials and options.
- Added new types and schemas for Vertex AI configuration in the data provider.
- Created utility functions for loading and validating Vertex AI credentials and configurations.
- Updated various services to integrate Vertex AI options into the Anthropic client setup.
…ation

- Updated the `getLLMConfig` function to throw a specific error message when credentials are missing, enhancing clarity for users.
- Refactored the `parseCredentials` function to handle plain API key strings more gracefully, returning them wrapped in an object if JSON parsing fails.
- Updated the `setOptions` method in `AgentClient` to use a parameter name for clarity.
- Refactored error handling in `loadDefaultModels` for better readability.
- Removed unnecessary blank lines in `initialize.js`, `endpoints.ts`, and `vertex.ts` to streamline the code.
- Enhanced formatting in `validateVertexConfig` for improved consistency and clarity.
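
A minimal sketch of the `parseCredentials` fallback described in the commit above (name and return shape are illustrative, not the PR's exact code): a JSON service-account key parses into an object, and anything that fails to parse is treated as a plain API key string and wrapped instead of throwing.

function parseCredentials(raw: string): Record<string, unknown> {
  try {
    return JSON.parse(raw);
  } catch {
    // Not JSON: assume a plain API key string.
    return { apiKey: raw };
  }
}
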
@usnavy13 usnavy13 marked this pull request as ready for review December 3, 2025 02:23
@Ziyann Ziyann mentioned this pull request Dec 3, 2025