Description
Reproduction steps
- Configure Zed to use the DeepSeek provider (e.g., deepseek-reasoner).
- Accumulate a large context in the chat panel (e.g., by adding large files or having a long conversation) so that the input tokens exceed roughly 67,000, half of the 131,072-token limit.
  - Specific example from my case: my input messages reached 70,142 tokens.
- Attempt to send a new message to the model.
Current vs. Expected behavior
Current behavior
The request fails immediately with a 400 Bad Request error from the API. The error message shows that Zed reserves a static, large completion budget (apparently hardcoded or defaulted to 64,000 tokens), regardless of the current input length.
Error: "This model's maximum context length is 131072 tokens. However, you requested 134142 tokens (70142 in the messages, 64000 in the completion)."
This static reservation effectively halves the model's usable context window: I cannot use the full 128k context for input because the client always reserves 64k for the output, which pushes the total over the limit.
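For clarity, here is the failing arithmetic as a runnable snippet (the numbers are taken verbatim from the error message; framing it as code is purely illustrative):

fn main() {
    // Numbers taken from the 400 error message above.
    let context_limit: u64 = 131_072;
    let input_tokens: u64 = 70_142;
    let static_completion_reservation: u64 = 64_000;

    let requested = input_tokens + static_completion_reservation;
    assert_eq!(requested, 134_142);
    // The API rejects the request because the total exceeds the limit.
    assert!(requested > context_limit);
}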
Expected behavior
Zed should handle the token limits more gracefully to avoid wasting the context window. I suggest two possible solutions:
1. Dynamic Calculation
Calculate max_tokens (the completion budget) dynamically from the remaining available context; see the sketch after this list.
Logic: max_tokens = model_context_limit - current_input_tokens - safety_buffer.
(In my case, requesting ~60,000 tokens instead of 64,000 would have allowed the request to succeed.)
2. Lower Default Value
Simply lower the default max_tokens for DeepSeek. A static reservation of 64k is excessive for a single response. Reducing this default (e.g., to 8k or 16k) would significantly increase the usable input context without requiring complex calculation logic.
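To make suggestion 1 concrete, here is a minimal Rust sketch of the clamping logic. All names and the safety-buffer value are illustrative assumptions, not Zed's actual internals:

// Minimal sketch of suggestion 1 (dynamic calculation). Hypothetical names.
fn completion_budget(
    model_context_limit: u64,  // e.g. 131_072 for deepseek-reasoner
    current_input_tokens: u64, // e.g. 70_142 in my failing request
    default_max_tokens: u64,   // the current static reservation, 64_000
    safety_buffer: u64,        // assumed headroom for token-count drift
) -> u64 {
    let remaining = model_context_limit
        .saturating_sub(current_input_tokens)
        .saturating_sub(safety_buffer);
    // Keep the static default when there is room for it, but shrink the
    // reservation when the input already occupies most of the window.
    remaining.min(default_max_tokens)
}

fn main() {
    // With my numbers and an assumed 1,024-token buffer, this yields
    // 59,906, and 70,142 + 59,906 = 130,048 <= 131,072.
    println!("{}", completion_budget(131_072, 70_142, 64_000, 1_024));
}

Using saturating_sub also avoids underflow when the input alone already exceeds the limit; in that case the budget collapses to zero and Zed could surface a clear "context full" message instead of a raw 400 from the API.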
Zed version and system specs
Zed: 0.217.3+stable.105.80433cb239e868271457ac376673a5f75bc4adb1
OS: Windows 11
Memory: 32 GB
Architecture: x86_64
Relevant Zed settings
settings.json
// Zed settings
//
// For information on how to configure Zed, see the Zed
// documentation: https://zed.dev/docs/configuring-zed
//
// To see all of Zed's default settings without changing your
// custom settings, run `zed: open default settings` from the
// command palette (cmd-shift-p / ctrl-shift-p)
{
"collaboration_panel": {
"dock": "left",
},
"agent": {
"always_allow_tool_actions": true,
"default_profile": "write",
"default_model": {
"provider": "deepseek",
"model": "deepseek-reasoner",
},
"model_parameters": [],
},
"disable_ai": false,
"terminal": {
"shell": {
"program": "nu",
},
"font_family": "MesloLGMDZ Nerd Font",
},
"telemetry": {
"diagnostics": true,
"metrics": true,
},
"base_keymap": "VSCode",
"minimap": {
"show": "auto",
},
"file_types": {},
"ui_font_size": 16,
"buffer_font_size": 15,
"theme": {
"mode": "system",
"light": "One Light",
"dark": "One Dark",
},
"language_models": {
"openai_compatible": {
"Moonshot": {
"api_url": "https://api.moonshot.cn/v1",
"available_models": [
{
"name": "kimi-k2-0905-preview",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
{
"name": "kimi-k2-turbo-preview",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
{
"name": "kimi-k2-thinking",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
{
"name": "kimi-k2-thinking-turbo",
"max_tokens": 262144,
"max_output_tokens": 32000,
"max_completion_tokens": 200000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": false,
"prompt_cache_key": false,
},
},
],
},
"Z.AI": {
"api_url": "https://open.bigmodel.cn/api/paas/v4/",
"available_models": [
{
"name": "glm-4.6",
"display_name": "GLM-4.6",
"max_tokens": 200000,
"max_output_tokens": 128000,
"max_completion_tokens": 128000,
"capabilities": {
"tools": true,
"images": false,
"parallel_tool_calls": true,
"prompt_cache_key": true,
},
},
],
},
},
},
}

(for AI issues) Model provider details
Provider: DeepSeek
Model Name: DeepSeek Reasoner
Mode: Agent Panel
If you are using WSL on Windows, what flavor of Linux are you using?
None