Repo for SmartSpeak, the final deliverable of Dialogue Systems (COSC-4463).

billsponsor/smart_speak

SmartSpeak: A Unified System for Transcription and Conversation

Repo for the final deliverable of Dialogue Systems (COSC-4463).

This paper introduces SmartSpeak, a dialogue system designed to enhance productivity in corporate meetings by integrating transcription, conversational synthesis, and real-time question-answering capabilities. Leveraging advancements in large language models and multimodal conversational attributes derived from linguistic research, SmartSpeak aims to address critical challenges in dialogue systems, including grounding, turn-taking, and speaker recognition. OpenAI’s Whisper and Resemblyzer are utilized for speech transcription and speaker identification, while LLMs process transcriptions to deliver contextually relevant responses. Through the implementation and evaluation of SmartSpeak, this study demonstrates the potential of integrating linguistic principles with AI technologies to create systems that effectively augment human conversations. Key findings reveal both the limitations of current transcription models in real-world scenarios and the promise of LLMs to generate actionable insights. The paper concludes with directions for future research, including multimodal enhancements, system optimization for conversational flow, and fine-tuning on domain-specific corpora to further bridge the gap between human dialogue and AI systems.
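The pipeline described above (transcribe, identify the speaker, then hand the labeled transcript to an LLM) can be illustrated with a minimal sketch. This is not the repository's implementation, which is not shown on this page. It assumes that each transcribed segment already carries a speaker embedding (for example, one produced by Resemblyzer for the audio span that Whisper transcribed) and that a few reference embeddings have been enrolled per speaker. The names `Segment`, `identify_speaker`, `label_transcript`, and `build_prompt`, the similarity threshold of 0.75, and the prompt wording are all hypothetical, and the toy 3-dimensional vectors stand in for real voice embeddings.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One transcribed span of audio (hypothetical structure)."""
    start: float            # seconds from the start of the recording
    end: float
    text: str
    embedding: List[float]  # stand-in for a voice embedding of this span


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_speaker(embedding: List[float],
                     enrolled: Dict[str, List[float]],
                     threshold: float = 0.75) -> str:
    """Return the enrolled speaker most similar to `embedding`.

    Falls back to "Unknown" when no enrolled speaker reaches `threshold`
    (an assumed value, not one taken from the paper).
    """
    best_name, best_score = "Unknown", threshold
    for name, reference in enrolled.items():
        score = cosine(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def label_transcript(segments: List[Segment],
                     enrolled: Dict[str, List[float]]) -> str:
    """Produce a time-ordered, speaker-labeled transcript."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start):
        speaker = identify_speaker(seg.embedding, enrolled)
        lines.append(f"[{seg.start:.1f}-{seg.end:.1f}] {speaker}: {seg.text}")
    return "\n".join(lines)


def build_prompt(transcript: str, question: str) -> str:
    """Assemble a question-answering prompt for an LLM (wording is illustrative)."""
    return (
        "You are assisting participants in a meeting.\n"
        "Answer the question using only the transcript below.\n\n"
        f"Transcript:\n{transcript}\n\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    enrolled = {"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}
    segments = [
        Segment(2.5, 5.0, "I can send the numbers today.", [0.1, 0.9, 0.0]),
        Segment(0.0, 2.5, "Let's review the budget.", [0.9, 0.1, 0.0]),
    ]
    transcript = label_transcript(segments, enrolled)
    print(build_prompt(transcript, "Who will send the numbers?"))
```

Keeping speaker identification separate from prompt construction means the transcription model, the embedding model, or the LLM could each be swapped without touching the other steps. The "Unknown" fallback is one simple way to avoid attributing speech to the wrong participant when the match is weak.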