Skip to content

Implement OpenAI-compatible TTS response format with backward compatibility #79

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Copilot
Copy link

@Copilot Copilot AI commented Jul 2, 2025

This PR implements OpenAI-compatible response formats for TTS providers, improving interoperability with tools and applications that expect OpenAI's standardized API responses while maintaining 100% backward compatibility.

🎯 What Changed

New OpenAI-Compatible Response Structure

TTS providers now return structured response objects that match OpenAI's audio.speech format:

{
  "id": "tts-550e8400-e29b-41d4-a716-446655440000",
  "object": "audio.speech", 
  "created": 1677858242,
  "model": "webscout-tts-v1",
  "voice": "alloy",
  "response_format": "mp3",
  "audio_file": "/path/to/generated/audio.mp3",
  "usage": {
    "characters": 42,
    "total_characters": 42
  }
}

Key Features

  • OpenAI API Compatibility: Standard response structure for better tool integration
  • Rich Metadata: Request IDs, timestamps, usage statistics, model information
  • Automatic Format Detection: Audio format detected from file extension
  • 100% Backward Compatibility: str(response) returns audio file path for existing code
  • Configurable: openai_compatible=True/False parameter for response format control
  • JSON Serializable: Full support for API serialization

📝 Usage Examples

OpenAI-Compatible Mode (Default)

from webscout.Provider.TTS import OpenAIFMTTS

tts = OpenAIFMTTS()
response = tts.tts("Hello, OpenAI compatibility!")

# Access rich metadata
print(f"ID: {response.id}")
print(f"Model: {response.model}")  
print(f"Usage: {response.usage.characters} characters")

# Backward compatibility - existing code still works
audio_path = str(response)

Legacy Mode

# Disable OpenAI compatibility for pure string responses
tts = OpenAIFMTTS(openai_compatible=False)
audio_path = tts.tts("Hello world")  # Returns string directly

Zero Breaking Changes

# Existing code continues to work unchanged
audio_file = tts.tts("Your text here")
play_audio(audio_file)  # Still works!

🔧 Technical Implementation

Files Added/Modified

  • Added: webscout/Provider/TTS/openai_utils.py - Core response structures
  • Modified: webscout/Provider/TTS/base.py - Updated base provider with OpenAI support
  • Modified: webscout/Provider/TTS/openai_fm.py - Updated to use new response format
  • Modified: webscout/Provider/TTS/streamElements.py - Updated to use new response format
  • Modified: webscout/Provider/TTS/__init__.py - Export new response classes

Core Components

  • TTSResponse class with OpenAI-compatible structure
  • TTSUsage class for usage tracking
  • create_tts_response() helper function
  • Updated BaseTTSProvider with create_response() method
  • Backward compatibility through __str__() and utility methods

🎁 Benefits

For Developers

  • Better Integration: Works seamlessly with OpenAI-compatible tools and libraries
  • Rich Metadata: Access to request IDs, timestamps, usage stats for monitoring
  • Future-Proof: Follows industry standard response format
  • Zero Migration: Existing code continues to work without changes

For Tools & Applications

  • Standard Interface: Consistent with OpenAI API expectations
  • Enhanced Monitoring: Request tracking and usage analytics
  • Better Debugging: Unique IDs for tracing requests
  • API Compatibility: Easy integration with existing OpenAI toolchains

🧪 Testing

Comprehensive test suite validates:

  • ✅ Response format validation
  • ✅ Provider integration
  • ✅ Backward compatibility
  • ✅ JSON serialization
  • ✅ Error handling
  • ✅ Multiple provider support

🔄 Migration Path

Immediate (No Action Required)

All existing code continues to work without modification.

Optional Enhancement

Developers can gradually adopt new features:

response = tts.tts("Text")
# Use new features when ready
if hasattr(response, 'usage'):
    track_usage(response.usage.characters)

This implementation significantly improves Webscout's TTS capabilities' interoperability while ensuring zero disruption to existing codebases.

Fixes #74.


💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.

@Copilot Copilot AI changed the title [WIP] Make TTS provider OpenAI-compatible Implement OpenAI-compatible TTS response format with backward compatibility Jul 2, 2025
@Copilot Copilot AI requested a review from OEvortex July 2, 2025 05:43
Copilot finished work on behalf of OEvortex July 2, 2025 05:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make TTS provider OpenAI-compatible
2 participants