Implement accurate token counting#57

Closed
OEvortex wants to merge 1 commit into main from
codex/add-token-count-to-stream-and-non-stream

Conversation

@OEvortex
Owner

Summary

  • import count_tokens from utils across providers
  • use count_tokens to compute prompt and completion tokens for AI providers

Testing

  • ruff check .
  • pytest -q (fails: command not found)

@OEvortex OEvortex requested a review from Copilot May 21, 2025 08:35
@OEvortex OEvortex closed this May 21, 2025
Contributor

Copilot AI left a comment


Pull Request Overview

This PR integrates a new count_tokens helper to replace the rough len(text.split()) and division-based token estimates across all OpenAI provider implementations.

  • Imported count_tokens in each provider module
  • Replaced imprecise len(content.split()) and len(text) // 4 token estimations with count_tokens(...)
  • Ensured both prompt and completion token counts use the shared utility
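A minimal sketch of what such a helper might look like. The actual count_tokens in webscout/Provider/OPENAI/utils.py may be implemented differently; tiktoken and the cl100k_base encoding are assumptions here, not confirmed by this PR:

```python
def count_tokens(text: str) -> int:
    """Return an accurate token count when tiktoken is available,
    falling back to the old rough heuristic otherwise.

    Sketch only: the real helper may use a different tokenizer.
    """
    try:
        import tiktoken  # optional dependency, assumed for illustration
        encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except ImportError:
        # The division-based estimate this PR is meant to replace
        return len(text) // 4
```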

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.

Show a summary per file

| File | Description |
| --- | --- |
| webscout/Provider/OPENAI/wisecat.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/venice.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/typegpt.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/typefully.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/textpollinations.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/standardinput.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/scirachat.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/netwrck.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/heckai.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/exachat.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/exaai.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/chatgptclone.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/chatgpt.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/c4ai.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/ai4chat.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/FreeGemini.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/Cloudflare.py | Imported count_tokens and updated token logic |
| webscout/Provider/OPENAI/BLACKBOXAI.py | Imported count_tokens and updated token logic |
Comments suppressed due to low confidence (2)

webscout/Provider/OPENAI/wisecat.py:12

  • The token counting logic is now duplicated across every provider module. Consider moving the prompt/completion token accumulation into a shared method on the base provider class to avoid repetition and simplify future updates.
from .utils import ( ... ChatCompletionMessage, CompletionUsage, count_tokens
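The deduplication the reviewer suggests could look roughly like the sketch below. The base-class name and method signature are hypothetical; the actual provider hierarchy in webscout may differ:

```python
from typing import Iterable


def count_tokens(text: str) -> int:
    # Stand-in for the shared utility in webscout/Provider/OPENAI/utils.py
    return len(text) // 4


class BaseProvider:
    """Hypothetical base class; illustrates hoisting the shared logic."""

    @staticmethod
    def usage_from_messages(messages: Iterable[dict], completion: str) -> dict:
        # Accumulate prompt tokens over all messages, then add the completion,
        # so each provider module no longer repeats this loop.
        prompt_tokens = sum(count_tokens(m.get("content", "")) for m in messages)
        completion_tokens = count_tokens(completion)
        return {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        }
```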

webscout/Provider/OPENAI/utils.py:1

  • There are no existing unit tests for count_tokens. Adding tests that validate its behavior on edge cases (empty strings, long texts, unicode) will help ensure accuracy.
def count_tokens(text: str) -> int:
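The edge-case tests the reviewer asks for might be sketched as below, written here against a stand-in heuristic so the example is self-contained; in the repository they would import the real helper from webscout/Provider/OPENAI/utils.py instead:

```python
def count_tokens(text: str) -> int:
    # Stand-in; replace with: from webscout.Provider.OPENAI.utils import count_tokens
    return len(text) // 4


def test_empty_string():
    assert count_tokens("") == 0


def test_long_text_scales_with_length():
    assert count_tokens("word " * 1000) > count_tokens("word " * 10)


def test_unicode_does_not_crash():
    # Accents and CJK characters should count without raising
    assert count_tokens("héllo wörld 你好") >= 0
```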

Comment on lines 105 to +107

      # Estimate prompt tokens based on message length
      for msg in payload.get("messages", []):
  -       prompt_tokens += len(msg.get("content", "").split())
  +       prompt_tokens += count_tokens(msg.get("content", ""))
Copy link

Copilot AI May 21, 2025


[nitpick] Invoking count_tokens for each message in a streaming loop may introduce performance overhead. You might batch messages or cache intermediate values to reduce repeated computations.
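One way to act on this nitpick: count prompt tokens once before streaming starts and count completion tokens once after the stream ends, rather than re-invoking count_tokens inside the chunk loop. A sketch with illustrative names (stream_with_usage and the dict shapes are assumptions, not the PR's actual code):

```python
def count_tokens(text: str) -> int:
    return len(text) // 4  # stand-in for the shared utility


def stream_with_usage(messages, chunks):
    """Yield text chunks, then a final usage dict, counting tokens only twice."""
    # Prompt side: computed once, before the hot streaming loop
    prompt_tokens = sum(count_tokens(m.get("content", "")) for m in messages)
    pieces = []
    for chunk in chunks:
        pieces.append(chunk)  # no token counting per chunk
        yield chunk
    # Completion side: computed once, over the joined output
    completion_tokens = count_tokens("".join(pieces))
    yield {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```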

@OEvortex OEvortex deleted the codex/add-token-count-to-stream-and-non-stream branch December 1, 2025 14:55
2 participants