fix(tts): make tts server requests non-blocking, improve clean_for_speech #422

Merged
merged 1 commit into from
Jan 24, 2025

Conversation

ErikBjare
Owner

@ErikBjare ErikBjare commented Jan 24, 2025

Important

Make TTS server requests non-blocking and improve text cleaning for speech in tts.py.

  • Behavior:
    • Introduces non-blocking TTS server requests by adding tts_request_queue and tts_processor_thread in tts.py.
    • Enhances clean_for_speech() to remove italic markup, details in parentheses, and emojis.
  • Threads:
    • Adds tts_processor_thread_func() for processing TTS requests in the background.
    • Updates ensure_threads() to manage both playback and TTS processor threads.
  • Functions:
    • Modifies speak() to queue text chunks for TTS processing and handle blocking mode.
    • Updates stop() to clear both audio and TTS request queues and stop the processor thread.
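A minimal sketch of the queue-plus-worker pattern described in the summary above, not the code from this PR. The names `tts_request_queue`, `tts_processor_thread_func`, and `speak` mirror the description; `audio_queue`, `fake_synthesize`, the `block` parameter, and the `None` shutdown sentinel are assumptions made for illustration.

```python
import queue
import threading

# Stand-ins for the queues named in the PR description (audio_queue is assumed).
tts_request_queue: queue.Queue = queue.Queue()
audio_queue: queue.Queue = queue.Queue()


def fake_synthesize(text: str) -> bytes:
    """Placeholder for the HTTP request to the TTS server (hypothetical)."""
    return text.encode("utf-8")


def tts_processor_thread_func() -> None:
    """Take text chunks off the request queue and hand audio to playback."""
    while True:
        chunk = tts_request_queue.get()
        try:
            if chunk is None:  # assumed sentinel: stop the worker
                return
            audio_queue.put(fake_synthesize(chunk))
        finally:
            # Mark the item done so join() in blocking mode can return.
            tts_request_queue.task_done()


def speak(text: str, block: bool = False) -> None:
    """Queue each non-empty line; in blocking mode, wait until all are processed."""
    for chunk in filter(None, (line.strip() for line in text.splitlines())):
        tts_request_queue.put(chunk)
    if block:
        tts_request_queue.join()


worker = threading.Thread(target=tts_processor_thread_func, daemon=True)
worker.start()
speak("Hello there.\nSecond chunk.", block=True)

# Shut the worker down again.
tts_request_queue.put(None)
worker.join(timeout=1)
```

With `block=False`, `speak()` returns as soon as the chunks are queued, so the caller is not held up by the server round trip.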

This description was created by Ellipsis for 947b514. It will automatically update as commits are pushed.

@codecov-commenter

codecov-commenter commented Jan 24, 2025

Codecov Report

Attention: Patch coverage is 14.47368% with 65 lines in your changes missing coverage. Please review.

Project coverage is 70.26%. Comparing base (9b4e911) to head (947b514).

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| gptme/tools/tts.py | 14.47% | 65 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #422      +/-   ##
==========================================
- Coverage   70.49%   70.26%   -0.23%     
==========================================
  Files          70       70              
  Lines        5876     5912      +36     
==========================================
+ Hits         4142     4154      +12     
- Misses       1734     1758      +24     
| Flag | Coverage Δ |
|---|---|
| anthropic/claude-3-haiku-20240307 | 68.36% <14.47%> (-0.36%) ⬇️ |
| deepseek/deepseek-chat | 64.03% <14.47%> (-0.21%) ⬇️ |
| openai/gpt-4o-mini | 67.62% <14.47%> (-0.23%) ⬇️ |

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Reviewed everything up to 1b3f801 in 1 minute and 23 seconds

More details
  • Looked at 271 lines of code in 1 file
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. gptme/tools/tts.py:241
  • Draft comment:
    Removing content within parentheses might strip important information. Consider refining this to only remove specific types of parenthetical content.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The clean_for_speech function removes content within parentheses, which might remove important information. This could be problematic if the text contains necessary details within parentheses.
2. gptme/tools/tts.py:459
  • Draft comment:
    Consider handling cases where clean is False to ensure text is still suitable for speech, or document the expected input format when clean is False.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The speak function does not handle the case where clean is set to False, and the text might contain unwanted content for speech.
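One way the more selective parenthesis handling suggested in draft comment 1 could look, as a rough sketch. This is not the `clean_for_speech()` from `gptme/tools/tts.py`: the function name `clean_for_speech_sketch`, the `max_paren_words` threshold, and the emoji ranges are all assumptions chosen for illustration.

```python
import re

# Rough emoji ranges; an assumption, not the pattern used in gptme.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")


def clean_for_speech_sketch(text: str, max_paren_words: int = 3) -> str:
    """Strip emphasis markup and emoji, dropping only long parenthetical asides."""
    # **bold** and *italic* become plain text.
    text = re.sub(r"\*{1,2}([^*\n]+)\*{1,2}", r"\1", text)

    def drop_long_aside(match: re.Match) -> str:
        # Keep short parentheticals (often versions or units), drop long asides.
        inner = match.group(1)
        return match.group(0) if len(inner.split()) <= max_paren_words else ""

    text = re.sub(r"\s*\(([^()]*)\)", drop_long_aside, text)
    text = EMOJI_RE.sub("", text)
    # Collapse the runs of spaces that the removals leave behind.
    return re.sub(r"[ \t]{2,}", " ", text).strip()


print(clean_for_speech_sketch("Use *pip* (Python 3.10+) now"))
```

A word-count threshold is only a heuristic; a short parenthetical can still be noise, and a long one can still matter.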

Workflow ID: wflow_GVCNEIDkXLp7C9a7


clear_queue()
log.info("Stopped TTS playback and cleared queue")
with tts_request_queue.mutex:
Contributor

Clearing tts_request_queue using queue.mutex is not thread-safe. Consider using a loop with get_nowait() to safely clear the queue.
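For comparison, a sketch of the two ways to clear a `queue.Queue` that this comment contrasts. The helper names are hypothetical. The mutex variant relies on attributes of CPython's `queue.Queue` (`mutex`, `queue`, `unfinished_tasks`, `all_tasks_done`, `not_full`) that are not part of the documented API, so treat it as an assumption about the implementation rather than a guaranteed interface.

```python
import queue


def clear_with_mutex(q: queue.Queue) -> None:
    """Clear everything in one step while holding the queue's own lock."""
    with q.mutex:
        q.queue.clear()
        # Caveat: if a consumer is mid-task, its later task_done() will raise
        # ValueError because the counter has already been reset.
        q.unfinished_tasks = 0
        q.all_tasks_done.notify_all()  # wake anything blocked in join()
        q.not_full.notify_all()  # wake producers blocked on a full queue


def clear_with_get_nowait(q: queue.Queue) -> int:
    """Drain using only the documented API; returns how many items were removed."""
    removed = 0
    while True:
        try:
            q.get_nowait()
        except queue.Empty:
            return removed
        q.task_done()
        removed += 1


q = queue.Queue()
for item in ("a", "b", "c"):
    q.put(item)
clear_with_mutex(q)
print(q.empty())
```

The loop variant is safe per call but not atomic as a whole: a producer can add items while it runs, so the queue is only guaranteed empty at the instant `queue.Empty` is raised.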

…eech to handle **emphasis**, emoji, and `(details)`
@ErikBjare ErikBjare force-pushed the dev/even-more-tts-fixes branch from 1b3f801 to 947b514 Compare January 24, 2025 15:54
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

👍 Looks good to me! Incremental review on 947b514 in 5 minutes and 6 seconds

More details
  • Looked at 301 lines of code in 1 file
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. gptme/tools/tts.py:241
  • Draft comment:
    Removing all content within parentheses might be too aggressive and could remove important information. Consider refining this regex to be more selective.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The clean_for_speech function removes all content within parentheses, which might remove important information. This could be too aggressive for some use cases.
2. gptme/tools/tts.py:72
  • Draft comment:
    Directly accessing tts_request_queue.mutex is not thread-safe and can lead to race conditions. Consider using a loop with get_nowait() to safely clear the queue.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is incorrect. The Queue class in Python is designed to have its mutex accessed this way for atomic operations. The suggested alternative of using get_nowait() in a loop would actually be less thread-safe as it could miss items added between iterations. The mutex context manager ensures atomic access to clear the queue and notify waiters.
    Could there be edge cases where accessing mutex directly could cause issues? Is there official documentation recommending against this pattern?
    No, this is a well-established pattern. The Queue class's mutex is specifically designed to be used this way for atomic operations. The Python standard library itself uses similar patterns.
    The comment should be deleted as it suggests a less thread-safe alternative to correct code.
3. gptme/tools/tts.py:370
  • Draft comment:
    Consider handling cases where the TTS server returns a non-200 status code without content to improve error logging clarity.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The tts_processor_thread_func function does not handle the case where the TTS server returns a non-200 status code without content, which might lead to unclear error logging.
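Draft comment 3 could be addressed along these lines. This is a hedged sketch: `describe_tts_failure` is a hypothetical helper, and the way `tts_processor_thread_func` actually reads the server response is not shown in this conversation.

```python
def describe_tts_failure(status_code: int, body: bytes) -> str:
    """Build a log message that stays readable when the error body is empty."""
    detail = body.decode("utf-8", errors="replace").strip()
    if not detail:
        detail = "<empty response body>"
    # Truncate so an unexpected HTML error page does not flood the log.
    return f"TTS server returned HTTP {status_code}: {detail[:200]}"


print(describe_tts_failure(500, b""))
```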

Workflow ID: wflow_4RkmmU8UZsOxCBbE


@ErikBjare ErikBjare merged commit 44c3dfe into master Jan 24, 2025
9 checks passed
2 participants