Repo for the final deliverable of Dialogue Systems (COSC-4463).
This paper introduces SmartSpeak, a dialogue system designed to enhance productivity in corporate meetings by integrating transcription, conversational synthesis, and real-time question answering. Leveraging advancements in large language models (LLMs) and conversational attributes drawn from linguistic research, SmartSpeak addresses critical challenges in dialogue systems, including grounding, turn-taking, and speaker recognition. OpenAI’s Whisper and Resemblyzer are used for speech transcription and speaker identification, while LLMs process the transcriptions to deliver contextually relevant responses. Through the implementation and evaluation of SmartSpeak, this study demonstrates the potential of integrating linguistic principles with AI technologies to build systems that effectively augment human conversations. Key findings reveal both the limitations of current transcription models in real-world scenarios and the promise of LLMs for generating actionable insights. The paper concludes with directions for future research, including multimodal enhancements, system optimization for conversational flow, and fine-tuning on domain-specific corpora to further bridge the gap between human dialogue and AI systems.
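The speaker-identification step described above can be illustrated with a minimal sketch. Resemblyzer produces fixed-length speaker embeddings, and matching an utterance to an enrolled speaker is typically done by cosine similarity against a library of known voices. The `identify_speaker` helper, the `threshold` value, the speaker names, and the tiny 3-dimensional vectors below are all illustrative assumptions, not SmartSpeak's actual implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def identify_speaker(utterance_embedding, enrolled, threshold=0.75):
    """Match an utterance embedding against enrolled speaker embeddings.

    Returns the best-matching speaker name, or "unknown" when no
    enrolled embedding exceeds the similarity threshold.
    """
    best_name, best_score = "unknown", threshold
    for name, embedding in enrolled.items():
        score = cosine_similarity(utterance_embedding, embedding)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Synthetic 3-dimensional embeddings stand in for Resemblyzer's
# 256-dimensional d-vectors (illustration only).
enrolled = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.9, 0.1]}
print(identify_speaker([0.85, 0.15, 0.05], enrolled))  # → alice
```

In a real pipeline, each Whisper-transcribed segment would be embedded with Resemblyzer's `VoiceEncoder` and tagged with the matched speaker before being passed to the LLM.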