|
| 1 | +--- |
| 2 | +title: Configure Against Multiple Providers |
| 3 | +description: How to mix and match models to optimize cost and performance |
| 4 | +--- |
| 5 | + |
| 6 | +The current state of generative AI is a sprawl of vendors offering products with different specialties and price points, and it's common for an optimal AI setup to use models from multiple vendors, or multiple models from the same vendor. Plural provides a number of knobs designed to make that degree of customization seamless and compatible with a GitOps workflow. |
| 7 | + |
| 8 | +## Provider Selection within Plural AI |
| 9 | + |
| 10 | +There are three main use cases for which you can choose different models: |
| 11 | + |
| 12 | +1. High-volume insight inference - a smaller, cheaper model is usually the right choice here (gpt-4.1-mini, for instance). |
| 13 | +2. On-demand agentic tool calling - this is used in Plural AI chat, PR generation, and similar features within the platform. Here you will very likely want a capable model like Anthropic Claude Sonnet 4.5. |
| 14 | +3. Embedding - Plural also consistently vector-embeds your infra data derived from GitOps and data scraping. It's possible you'll need to configure a dedicated embedding model provider if the base provider doesn't support embeddings. |
| 15 | + |
| 16 | +Often, toggling these individually gives you the best cost/feature tradeoff for your use case. |
| 17 | + |
| 18 | +## Example Configuration |
| 19 | + |
| 20 | +To tune your AI configuration, we recommend working within a GitOps workflow using our `DeploymentSettings` Kubernetes CRD. If you set up Plural with `plural up`, this will already be defined for you at `bootstrap/settings.yaml`. Here's a nearly complete example of how to configure its AI model settings: |
| 21 | + |
| 22 | + |
| 23 | +```yaml |
| 24 | +apiVersion: deployments.plural.sh/v1alpha1 |
| 25 | +kind: DeploymentSettings |
| 26 | +metadata: |
| 27 | +  name: global |
| 28 | +  namespace: plrl-deploy-operator # this namespace is required |
| 29 | +spec: |
| 30 | +  ai: |
| 31 | +    enabled: true |
| 32 | +    provider: OPENAI # OPENAI gpt-4.1-mini for low-cost, high-volume |
| 33 | +    embeddingProvider: OPENAI # openai has an embedding model built-in |
| 34 | +    toolProvider: ANTHROPIC # anthropic for complex tool calling |
| 35 | +
| 36 | +    # example configurations for the supported AI providers; it's not necessary to |
| 37 | +    # configure all of them |
| 38 | +    anthropic: |
| 39 | +      model: claude-sonnet-4-5 |
| 40 | +      tokenSecretRef: |
| 41 | +        name: ai-config |
| 42 | +        key: anthropic |
| 43 | +
| 44 | +    openAI: |
| 45 | +      tokenSecretRef: |
| 46 | +        name: ai-config |
| 47 | +        key: openai |
| 48 | +      model: gpt-4.1-mini |
| 49 | +      toolModel: gpt-4.1 |
| 50 | +
| 51 | +    azure: |
| 52 | +      tokenSecretRef: |
| 53 | +        name: ai-config |
| 54 | +        key: azure |
| 55 | +      endpoint: https://plural-openai.openai.azure.com/openai/deployments |
| 56 | +      apiVersion: '2024-10-21' |
| 57 | +
| 58 | +    vertex: |
| 59 | +      project: pluralsh-test-384515 |
| 60 | +      location: us-east5 |
| 61 | +      model: anthropic/claude-sonnet-4-5 |
| 62 | +      serviceAccountJsonSecretRef: |
| 63 | +        name: ai-config |
| 64 | +        key: vertex |
| 65 | +``` |
| 66 | +
|
| 67 | +{% callout severity="info" %} |
| 68 | +All the secret references above (`tokenSecretRef` and `serviceAccountJsonSecretRef`) point to a Kubernetes secret defined like: |
| 69 | +
|
| 70 | +```yaml |
| 71 | +apiVersion: v1 |
| 72 | +kind: Secret |
| 73 | +metadata: |
| 74 | +  namespace: plrl-deploy-operator |
| 75 | +  name: ai-config |
| 76 | +stringData: |
| 77 | +  openai: ... |
| 78 | +  anthropic: ... |
| 79 | +  azure: ... |
| 80 | +  vertex: ... |
| 81 | +``` |
| 82 | +{% /callout %} |
| 83 | +
|
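The introduction also mentions using multiple models within the same vendor. As a minimal sketch of that case, the fragment below points all three provider settings at OpenAI and relies on the `model` and `toolModel` fields from the example above. It assumes `toolModel` is the model used for tool calling when OpenAI is the tool provider; check the API reference linked in the next section before relying on this.

```yaml
apiVersion: deployments.plural.sh/v1alpha1
kind: DeploymentSettings
metadata:
  name: global
  namespace: plrl-deploy-operator
spec:
  ai:
    enabled: true
    provider: OPENAI          # high-volume insight inference
    toolProvider: OPENAI      # agentic tool calling, same vendor
    embeddingProvider: OPENAI # embeddings

    openAI:
      tokenSecretRef:
        name: ai-config
        key: openai
      model: gpt-4.1-mini # cheaper model for high-volume insights
      toolModel: gpt-4.1  # assumption: used for tool-calling features
```

Only the `openai` key of the `ai-config` secret is needed in this variant, since no other provider block is configured.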
| 84 | +## Learn More |
| 85 | +
|
| 86 | +You can see the full docs for this resource at https://docs.plural.sh/overview/management-api-reference#deploymentsettingsspec |