
feat: use openrouter's builtin metric #8314


Open
wants to merge 1 commit into main

Conversation

happyZYM
Contributor

What this PR does

Before this PR: model prices had to be configured manually, and spend was metered using locally estimated token counts.

After this PR: for the OpenRouter provider, its builtin usage accounting is used directly (https://openrouter.ai/docs/use-cases/usage-accounting).
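For reference, usage accounting is enabled by adding a `usage` object to the chat completion request, per the linked documentation. The sketch below is illustrative only: the endpoint URL and `usage.include` flag come from the docs, while the helper names and the exact response fields read back (`prompt_tokens`, `completion_tokens`, `cost`) should be treated as assumptions.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_payload(model: str, messages: list) -> dict:
    """Build a chat completion request with usage accounting enabled."""
    return {
        "model": model,
        "messages": messages,
        # Per https://openrouter.ai/docs/use-cases/usage-accounting, this
        # asks OpenRouter to return native token counts and cost in credits.
        "usage": {"include": True},
    }


def extract_usage(response: dict) -> dict:
    """Pull the usage block out of a non-streaming completion response."""
    usage = response.get("usage", {})
    return {
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "cost": usage.get("cost"),  # credits spent, as reported by OpenRouter
    }


def send(payload: dict, api_key: str) -> dict:
    """POST the request to OpenRouter (network call; untested sketch)."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With this shape, the provider no longer needs a local price table: the `cost` field can be recorded directly instead of multiplying estimated token counts by configured prices.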

Checklist

This checklist is not enforced, but it is a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

@vaayne
Collaborator

vaayne commented Jul 20, 2025

https://openrouter.ai/docs/api-reference/get-a-generation
To get OpenRouter's exact token usage information, you need to call this endpoint. The token counts returned by the Chat API are estimated using the gpt-4 tiktoken tokenizer and are not exact figures.
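A minimal sketch of querying the generation stats endpoint linked above. The path and `id` query parameter follow the linked API reference; the response field names used here (`native_tokens_prompt`, `native_tokens_completion`, `total_cost`) are assumptions based on that reference and should be verified against the live API.

```python
import json
import urllib.request


def fetch_generation_stats(generation_id: str, api_key: str) -> dict:
    """GET /api/v1/generation?id=... (network call; untested sketch)."""
    req = urllib.request.Request(
        f"https://openrouter.ai/api/v1/generation?id={generation_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def native_token_counts(stats: dict) -> tuple:
    """Extract the native (per-model-tokenizer) counts and total cost.

    Field names are assumed from the API reference, not verified here.
    """
    data = stats.get("data", {})
    return (
        data.get("native_tokens_prompt"),
        data.get("native_tokens_completion"),
        data.get("total_cost"),
    )
```

The drawback of this route is the extra round trip per completion, which is what motivates using the in-response usage accounting instead.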

@happyZYM
Contributor Author

https://openrouter.ai/docs/api-reference/get-a-generation To get OpenRouter's exact token usage information, you need to call this endpoint. The token counts returned by the Chat API are estimated using the gpt-4 tiktoken tokenizer and are not exact figures.

The documentation at https://openrouter.ai/docs/api-reference/overview#querying-cost-and-stats says:

The token counts that are returned in the completions API response are not counted via the model’s native tokenizer. Instead it uses a normalized, model-agnostic count (accomplished via the GPT4o tokenizer). This is because some providers do not reliably return native token counts. This behavior is becoming more rare, however, and we may add native token counts to the response object in the future.

whereas the documentation at https://openrouter.ai/docs/use-cases/usage-accounting says:

When enabled, the API will return detailed usage information including:

    Prompt and completion token counts using the model’s native tokenizer
    Cost in credits
    Reasoning token counts (if applicable)
    Cached token counts (if available)

This information is included in the last SSE message for streaming responses, or in the complete response for non-streaming requests.

From my actual testing so far (I tested mainstream models including gemini, claude, gpt, and qwen), behavior has indeed switched to what the second document describes: with usage accounting enabled in the chat completion request, the real token counts are returned.
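For the streaming case the quoted documentation mentions, the usage object arrives in the last SSE message, so a client can scan the `data:` lines and keep the last non-empty usage it sees. A minimal sketch, assuming standard `text/event-stream` framing and the OpenAI-style `[DONE]` sentinel; the exact chunk shape is taken from the documentation excerpt above:

```python
import json


def usage_from_sse(lines):
    """Return the usage object from the last SSE chunk that carries one.

    Per the usage-accounting docs, this is the final message of the
    stream when `usage: {"include": true}` is set on the request.
    """
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, event names, and blank keep-alives
        body = line[len("data:"):].strip()
        if body == "[DONE]":
            break
        chunk = json.loads(body)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```

This keeps the metering logic identical for streaming and non-streaming requests: both end with a single authoritative usage object rather than a locally estimated count.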
