AI: A single prompt alone consumes almost all of the 128k tokens of GPT-5 / Sonnet 4 #37274
UdittLamba started this conversation in LLMs and Zed Agent
Replies: 0 comments
Summary
Previously, a single 128k context lasted for multiple prompts, but now a single response to a single prompt consumes most of the tokens. Continuing in a new thread doesn't help either, as the "new from summary" feature is bugged as well.
Description
Steps to trigger the problem:
Expected Behavior: A single context should last for multiple complex prompts.
Actual Behavior: It barely lasts a single prompt after the latest update.
Model Provider Details
Other Details (MCPs, other settings, etc): none
Zed Version and System Specs
Zed: v0.201.8 (Zed)
OS: Linux Wayland manjaro unknown
Memory: 31.1 GiB
Architecture: x86_64
GPU: AMD Radeon RX 7900 XTX (RADV NAVI31) || radv || Mesa 25.2.1-arch1.4