Commit 4ef3813

Merge remote-tracking branch 'vocodedev/main'
2 parents 9e5a207 + b74c85b commit 4ef3813

257 files changed

+18699
-10855
lines changed

.github/workflows/test.yml

-2
@@ -15,8 +15,6 @@ jobs:
       fail-fast: false
       matrix:
         python-version:
-          - "3.8"
-          - "3.9"
           - "3.10"
           - "3.11"
         poetry-version:

Makefile

+6
@@ -9,6 +9,12 @@ transcribe:
 synthesize:
 	poetry run python playground/streaming/synthesizer/synthesize.py
 
+turn_based_conversation:
+	poetry run python quickstarts/turn_based_conversation.py
+
+streaming_conversation:
+	poetry run python quickstarts/streaming_conversation.py
+
 PYTHON_FILES=.
 lint: PYTHON_FILES=vocode/ quickstarts/ playground/
 lint_diff typecheck_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d main | grep -E '\.py$$')

README.md

+59-44
@@ -2,10 +2,11 @@
 
 ![Hero](https://user-images.githubusercontent.com/6234599/228337850-e32bb01d-3701-47ef-a433-3221c9e0e56e.png)
 
-[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/vocodehq.svg?style=social&label=Follow%20%40vocodehq)](https://twitter.com/vocodehq) [![GitHub Repo stars](https://img.shields.io/github/stars/vocodedev/vocode-python?style=social)](https://github.com/vocodedev/vocode-python)
+[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/vocodehq.svg?style=social&label=Follow%20%40vocodehq)](https://twitter.com/vocodehq) [![GitHub Repo stars](https://img.shields.io/github/stars/vocodedev/vocode-core?style=social)](https://github.com/vocodedev/vocode-core)
+[![pypi](https://img.shields.io/pypi/v/vocode.svg)](https://pypi.python.org/pypi/vocode)
 [![Downloads](https://static.pepy.tech/badge/vocode/month)](https://pepy.tech/project/vocode)
 
-[Community](https://discord.gg/NaU4mMgcnC) | [Docs](https://docs.vocode.dev) | [Dashboard](https://app.vocode.dev)
+[Community](https://discord.gg/NaU4mMgcnC) | [Docs](https://docs.vocode.dev/open-source) | [Dashboard](https://app.vocode.dev)
 
 </div>
 
@@ -19,11 +20,11 @@ We're actively looking for community maintainers, so please reach out if interes
 
 # ⭐️ Features
 
-- 🗣 [Spin up a conversation with your system audio](https://docs.vocode.dev/python-quickstart)
-- ➡️ 📞 [Set up a phone number that responds with a LLM-based agent](https://docs.vocode.dev/telephony#inbound-calls)
-- 📞 ➡️ [Send out phone calls from your phone number managed by an LLM-based agent](https://docs.vocode.dev/telephony#outbound-calls)
-- 🧑‍💻 [Dial into a Zoom call](https://github.com/vocodedev/vocode-python/blob/main/vocode/streaming/telephony/hosted/zoom_dial_in.py)
-- 🤖 [Use an outbound call to a real phone number in a Langchain agent](https://docs.vocode.dev/langchain-agent)
+- 🗣 [Spin up a conversation with your system audio](https://docs.vocode.dev/open-source/python-quickstart)
+- ➡️ 📞 [Set up a phone number that responds with a LLM-based agent](https://docs.vocode.dev/open-source/telephony#inbound-calls)
+- 📞 ➡️ [Send out phone calls from your phone number managed by an LLM-based agent](https://docs.vocode.dev/telephony/open-source/#outbound-calls)
+- 🧑‍💻 [Dial into a Zoom call](https://github.com/vocodedev/vocode-core/blob/53b01dab0b59f71961ee83dbcaf3653a6935c2e3/vocode/streaming/telephony/conversation/zoom_dial_in.py)
+- 🤖 [Use an outbound call to a real phone number in a Langchain agent](https://docs.vocode.dev/open-source/langchain-agent)
 - Out of the box integrations with:
   - Transcription services, including:
     - [AssemblyAI](https://www.assemblyai.com/)
@@ -34,19 +35,16 @@ We're actively looking for community maintainers, so please reach out if interes
     - [RevAI](https://www.rev.ai/)
     - [Whisper](https://openai.com/blog/introducing-chatgpt-and-whisper-apis)
     - [Whisper.cpp](https://github.com/ggerganov/whisper.cpp)
-
   - LLMs, including:
-    - [ChatGPT](https://openai.com/blog/chatgpt)
-    - [GPT-4](https://platform.openai.com/docs/models/gpt-4)
+    - [OpenAI](https://platform.openai.com/docs/models)
     - [Anthropic](https://www.anthropic.com/)
-    - [GPT4All](https://github.com/nomic-ai/gpt4all)
   - Synthesis services, including:
     - [Rime.ai](https://rime.ai)
     - [Microsoft Azure](https://azure.microsoft.com/en-us/products/cognitive-services/text-to-speech/)
     - [Google Cloud](https://cloud.google.com/text-to-speech)
     - [Play.ht](https://play.ht)
     - [Eleven Labs](https://elevenlabs.io/)
-    - [Coqui](https://coqui.ai/)
+    - [Cartesia](https://cartesia.ai/)
     - [Coqui (OSS)](https://github.com/coqui-ai/TTS)
     - [gTTS](https://gtts.readthedocs.io/)
     - [StreamElements](https://streamelements.com/)
@@ -59,45 +57,63 @@ Check out our React SDK [here](https://github.com/vocodedev/vocode-react-sdk)!
 
 We're an open source project and are extremely open to contributors adding new features, integrations, and documentation! Please don't hesitate to reach out and get started building with us.
 
-For more information on contributing, see our [Contribution Guide](https://github.com/vocodedev/vocode-python/blob/main/contributing.md).
+For more information on contributing, see our [Contribution Guide](https://github.com/vocodedev/vocode-core/blob/main/contributing.md).
 
-And check out our [Roadmap](https://github.com/vocodedev/vocode-python/blob/main/roadmap.md).
+And check out our [Roadmap](https://github.com/vocodedev/vocode-core/blob/main/roadmap.md).
 
 We'd love to talk to you on [Discord](https://discord.gg/NaU4mMgcnC) about new ideas and contributing!
 
 # 🚀 Quickstart
 
 ```bash
-pip install 'vocode'
+pip install vocode
 ```
 
 ```python
 import asyncio
-import logging
 import signal
-from vocode.streaming.streaming_conversation import StreamingConversation
+
+from pydantic_settings import BaseSettings, SettingsConfigDict
+
 from vocode.helpers import create_streaming_microphone_input_and_speaker_output
-from vocode.streaming.transcriber import *
-from vocode.streaming.agent import *
-from vocode.streaming.synthesizer import *
-from vocode.streaming.models.transcriber import *
-from vocode.streaming.models.agent import *
-from vocode.streaming.models.synthesizer import *
+from vocode.logging import configure_pretty_logging
+from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
+from vocode.streaming.models.agent import ChatGPTAgentConfig
 from vocode.streaming.models.message import BaseMessage
-import vocode
-
-# these can also be set as environment variables
-vocode.setenv(
-    OPENAI_API_KEY="<your OpenAI key>",
-    DEEPGRAM_API_KEY="<your Deepgram key>",
-    AZURE_SPEECH_KEY="<your Azure key>",
-    AZURE_SPEECH_REGION="<your Azure region>",
+from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
+from vocode.streaming.models.transcriber import (
+    DeepgramTranscriberConfig,
+    PunctuationEndpointingConfig,
 )
+from vocode.streaming.streaming_conversation import StreamingConversation
+from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
+from vocode.streaming.transcriber.deepgram_transcriber import DeepgramTranscriber
 
+configure_pretty_logging()
 
-logging.basicConfig()
-logger = logging.getLogger(__name__)
-logger.setLevel(logging.DEBUG)
+
+class Settings(BaseSettings):
+    """
+    Settings for the streaming conversation quickstart.
+    These parameters can be configured with environment variables.
+    """
+
+    openai_api_key: str = "ENTER_YOUR_OPENAI_API_KEY_HERE"
+    azure_speech_key: str = "ENTER_YOUR_AZURE_KEY_HERE"
+    deepgram_api_key: str = "ENTER_YOUR_DEEPGRAM_API_KEY_HERE"
+
+    azure_speech_region: str = "eastus"
+
+    # This means a .env file can be used to overload these settings
+    # ex: "OPENAI_API_KEY=my_key" will set openai_api_key over the default above
+    model_config = SettingsConfigDict(
+        env_file=".env",
+        env_file_encoding="utf-8",
+        extra="ignore",
+    )
+
+
+settings = Settings()
 
 
 async def main():
@@ -106,8 +122,6 @@ async def main():
         speaker_output,
     ) = create_streaming_microphone_input_and_speaker_output(
         use_default_devices=False,
-        logger=logger,
-        use_blocking_speaker_output=True
     )
 
     conversation = StreamingConversation(
@@ -116,24 +130,25 @@ async def main():
             DeepgramTranscriberConfig.from_input_device(
                 microphone_input,
                 endpointing_config=PunctuationEndpointingConfig(),
-            )
+                api_key=settings.deepgram_api_key,
+            ),
         ),
         agent=ChatGPTAgent(
             ChatGPTAgentConfig(
+                openai_api_key=settings.openai_api_key,
                 initial_message=BaseMessage(text="What up"),
                 prompt_preamble="""The AI is having a pleasant conversation about life""",
             )
         ),
         synthesizer=AzureSynthesizer(
-            AzureSynthesizerConfig.from_output_device(speaker_output)
+            AzureSynthesizerConfig.from_output_device(speaker_output),
+            azure_speech_key=settings.azure_speech_key,
+            azure_speech_region=settings.azure_speech_region,
         ),
-        logger=logger,
     )
     await conversation.start()
     print("Conversation started, press Ctrl+C to end")
-    signal.signal(
-        signal.SIGINT, lambda _0, _1: asyncio.create_task(conversation.terminate())
-    )
+    signal.signal(signal.SIGINT, lambda _0, _1: asyncio.create_task(conversation.terminate()))
     while conversation.is_active():
         chunk = await microphone_input.get_audio()
         conversation.receive_audio(chunk)
@@ -145,8 +160,8 @@ if __name__ == "__main__":
 
 # 📞 Phone call quickstarts
 
-- [Telephony Server - Self-hosted](https://docs.vocode.dev/telephony)
+- [Telephony Server - Self-hosted](https://docs.vocode.dev/open-source/telephony)
 
 # 🌱 Documentation
 
-[docs.vocode.dev](https://docs.vocode.dev/)
+[docs.vocode.dev](https://docs.vocode.dev/open-source)
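
Note on the README change: the comment in the new `Settings` class says that an entry such as `OPENAI_API_KEY=my_key` overrides the default for `openai_api_key`. That precedence can be sketched with the standard library alone. This is an illustrative approximation, not pydantic-settings itself (it skips `.env` parsing and validation), and `QuickstartSettings` and `load_settings` are hypothetical names that do not appear in the commit.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class QuickstartSettings:
    # Hypothetical stand-in for the README's Settings class; defaults copied from the diff.
    openai_api_key: str = "ENTER_YOUR_OPENAI_API_KEY_HERE"
    azure_speech_region: str = "eastus"


def load_settings(environ=None):
    """Build settings where an upper-cased variable of the same name beats the default."""
    environ = os.environ if environ is None else environ
    overrides = {
        field.name: environ[field.name.upper()]
        for field in fields(QuickstartSettings)
        if field.name.upper() in environ
    }
    return QuickstartSettings(**overrides)


# A plain dict stands in for the process environment here.
settings = load_settings({"OPENAI_API_KEY": "my_key"})
print(settings.openai_api_key)       # value supplied by the mapping
print(settings.azure_speech_region)  # default kept because no override was given
```

In the committed quickstart the same effect would presumably come from exporting the variable or adding it to `.env` before `Settings()` is instantiated.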

apps/client_backend/Dockerfile

+2-2
@@ -1,4 +1,4 @@
-FROM python:3.9-bullseye
+FROM python:3.11-bullseye
 
 # get portaudio and ffmpeg
 RUN apt-get update \
@@ -15,4 +15,4 @@ RUN poetry config virtualenvs.create false
 RUN poetry install --no-dev --no-interaction --no-ansi
 COPY main.py /code/main.py
 
-CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "3000"]
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "3000"]

apps/client_backend/main.py

+6-11
@@ -1,23 +1,19 @@
-import logging
+from dotenv import load_dotenv
 from fastapi import FastAPI
 
-from vocode.streaming.models.agent import ChatGPTAgentConfig
-from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
-from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
-
+from vocode.logging import configure_pretty_logging
 from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
 from vocode.streaming.client_backend.conversation import ConversationRouter
+from vocode.streaming.models.agent import ChatGPTAgentConfig
 from vocode.streaming.models.message import BaseMessage
-
-from dotenv import load_dotenv
+from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
+from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
 
 load_dotenv()
 
 app = FastAPI(docs_url=None)
 
-logging.basicConfig()
-logger = logging.getLogger(__name__)
-logger.setLevel(logging.DEBUG)
+configure_pretty_logging()
 
 conversation_router = ConversationRouter(
     agent_thunk=lambda: ChatGPTAgent(
@@ -31,7 +27,6 @@
             output_audio_config, voice_name="en-US-SteffanNeural"
         )
     ),
-    logger=logger,
 )
 
 app.include_router(conversation_router.get_router())
