Description
Reproduction steps
- Configure Zed to use the DeepSeek provider (e.g., deepseek-reasoner).
- Accumulate a large context in the chat panel (e.g., by adding large files or having a long conversation) so that the input tokens exceed roughly 67,000, half of the 131,072-token limit.
  - Specific example from my case: my input messages reached 70,142 tokens.
- Attempt to send a new message to the model.
Current vs. Expected behavior
Current behavior
The request fails immediately with a 400 Bad Request error from the API. The error message shows that Zed reserves a static, large completion budget (apparently hardcoded or defaulted to 64,000 tokens), regardless of the current input length.
Error: "This model's maximum context length is 131072 tokens. However, you requested 134142 tokens (70142 in the messages, 64000 in the completion)."
This static reservation effectively halves the model's usable context window: I cannot use the full 128k context for input because the client always reserves 64k for the output, which pushes the total over the limit.
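For clarity, here is the failing arithmetic as a runnable snippet (the numbers are taken verbatim from the error message; framing it as code is purely illustrative):

fn main() {
    // Numbers taken from the 400 error message above.
    let context_limit: u64 = 131_072;
    let input_tokens: u64 = 70_142;
    let static_completion_reservation: u64 = 64_000;

    let requested = input_tokens + static_completion_reservation;
    assert_eq!(requested, 134_142);
    // The API rejects the request because the total exceeds the limit.
    assert!(requested > context_limit);
}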
Expected behavior
Zed should handle the token limits more gracefully to avoid wasting the context window. I suggest two possible solutions:
1. Dynamic Calculation
Calculate max_tokens (the completion budget) dynamically from the remaining available context; see the sketch after this list.
Logic: max_tokens = model_context_limit - current_input_tokens - safety_buffer.
(In my case, requesting ~60,000 tokens instead of 64,000 would have allowed the request to succeed.)
2. Lower Default Value
Simply lower the default max_tokens for DeepSeek. A static reservation of 64k is excessive for a single response. Reducing this default (e.g., to 8k or 16k) would significantly increase the usable input context without requiring complex calculation logic.
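To make suggestion 1 concrete, here is a minimal Rust sketch of the clamping logic. All names and the safety-buffer value are illustrative assumptions, not Zed's actual internals:

// Minimal sketch of suggestion 1 (dynamic calculation). Hypothetical names.
fn completion_budget(
    model_context_limit: u64,  // e.g. 131_072 for deepseek-reasoner
    current_input_tokens: u64, // e.g. 70_142 in my failing request
    default_max_tokens: u64,   // the current static reservation, 64_000
    safety_buffer: u64,        // assumed headroom for token-count drift
) -> u64 {
    let remaining = model_context_limit
        .saturating_sub(current_input_tokens)
        .saturating_sub(safety_buffer);
    // Keep the static default when there is room for it, but shrink the
    // reservation when the input already occupies most of the window.
    remaining.min(default_max_tokens)
}

fn main() {
    // With my numbers and an assumed 1,024-token buffer, this yields
    // 59,906, and 70,142 + 59,906 = 130,048 <= 131,072.
    println!("{}", completion_budget(131_072, 70_142, 64_000, 1_024));
}

Using saturating_sub also avoids underflow when the input alone already exceeds the limit; in that case the budget collapses to zero and Zed could surface a clear "context full" message instead of a raw 400 from the API.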
Zed version and system specs
Zed: 0.217.3+stable.105.80433cb239e868271457ac376673a5f75bc4adb1
OS: Windows 11
Memory: 32 GB
Architecture: x86_64
Relevant Zed settings
settings.json
// Zed settings
//
// For information on how to configure Zed, see the Zed
// documentation: https://zed.dev/docs/configuring-zed
//
// To see all of Zed's default settings without changing your
// custom settings, run `zed: open default settings` from the
// command palette (cmd-shift-p / ctrl-shift-p)
{
"collaboration_panel": {
"dock": "left",
},
"agent": {
"always_allow_tool_actions": true,
"default_profile": "write",
"default_model": {
"provider": "deepseek",
"model": "deepseek-reasoner",
},
"model_parameters": [],
},
"disable_ai": false,
"terminal": {
"shell": {
"program": "nu",
},
"font_family": "MesloLGMDZ Nerd Font",
},
"telemetry": {
"diagnostics": true,
"metrics": true,
},
"base_keymap": "VSCode",
"minimap": {
"show": "auto",
},
"file_types": {},
"ui_font_size": 16,
"buffer_font_size": 15,
"theme": {
"mode": "system",
"light": "One Light",
"dark": "One Dark",
},
"language_models": {
"openai_compatible": {
"Moonshot": {
"api_url": "https://api.moonshot.cn/v1",
"available_models": [
{
"name": "kimi-k2-0905-preview",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
{
"name": "kimi-k2-turbo-preview",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
{
"name": "kimi-k2-thinking",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
{
"name": "kimi-k2-thinking-turbo",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
],
},
"Z.AI": {
"api_url": "https://open.bigmodel.cn/api/paas/v4/",
"available_models": [
{
"name": "glm-4.6",
"display_name": "GLM-4.6",
"max_tokens": 200000,
"max_output_tokens": 128000,
"max_completion_tokens": 128000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": true,
"prompt_cache_key": true,
},
},
],
},
},
},
}

(for AI issues) Model provider details
Provider: DeepSeek
Model Name: DeepSeek Reasoner
Mode: Agent Panel
If you are using WSL on Windows, what flavor of Linux are you using?
None