doc(README): update README

ultrasev · Apr 23, 2024 · 7cff9dd
1 parent a711fdd
commit 7cff9dd
Showing 1 changed file with 35 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -10,7 +10,10 @@
 
 
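 # Usage
-## Server
+## 1. Separate server and client
+Suitable when the GPU is in the cloud.
+### Server
+Receives audio data from the client, runs speech recognition, and returns the recognition results to the client.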
 ```bash
 git clone https://github.com/ultrasev/stream-whisper
 apt -y install libcublas11
@@ -26,7 +29,9 @@ pip3 install -r requirements.txt
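 On the first run, the speech recognition model is downloaded from Hugging Face, which takes a while. Downloads from Hugging Face are very slow behind the firewall, so using a proxy is recommended.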
 
 
-## Client
+### Client
+Records audio, sends the audio data to the server, and receives the recognition results the server returns.
+
 ```bash
 git clone https://github.com/ultrasev/stream-whisper
 apt -y install portaudio19-dev
@@ -39,6 +44,34 @@ pip3 install -r requirements.txt
 
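 Likewise, change `REDIS_SERVER` in the `.env` file to your own Redis address, then run `python3 -m src.client` on your local machine to start the client. Before running it, test the microphone and confirm that it records properly.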
 
+## 2. Run directly on the local machine
+If you have a local GPU, run `src/local_deploy.py` to run both the server and the client on the local machine.
+```bash
+git clone https://github.com/ultrasev/stream-whisper
+apt -y install portaudio19-dev  libcublas11
+python3 src/local_deploy.py
+```
+
+
+# One-command Docker deployment of your own Whisper transcription service
+```bash
+docker run -d --name whisper -p 8000:8000 ghcr.io/ultrasev/whisper
+```
+The API is compatible with OpenAI's [API specification](https://platform.openai.com/docs/guides/speech-to-text), so you can call it directly with the OpenAI SDK.
+
+```python
+from openai import OpenAI
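+# The SDK needs an api_key (or OPENAI_API_KEY); "not-needed" is a placeholder, assuming the self-hosted server ignores it.
+client = OpenAI(base_url="http://localhost:8000", api_key="not-needed")
+audio_file = open("/path/to/file/audio.mp3", "rb")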
+transcription = client.audio.transcriptions.create(
+  model="whisper-1",
+  file=audio_file
+)
+print(transcription.text)
+```
+
+
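 # Possible improvements
 1. Shorten the silence interval to improve real-time responsiveness. The default silence interval is 0.5 seconds; adjust it in `client.py` as needed.
 2. Use a better speech recognition model to improve recognition accuracy.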
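
To illustrate how the silence interval from item 1 affects latency, here is a minimal sketch of silence-based segmentation. It is not the repository's `client.py`: the function names, the 30 ms frame length, the 16-bit PCM format, and the energy threshold are all assumptions made for the example. A shorter `silence_interval` closes an utterance sooner, so text comes back faster, at the cost of splitting speech at shorter pauses.

```python
import array
import math

FRAME_MS = 30  # assumed frame length in milliseconds (hypothetical)


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a frame of 16-bit PCM samples."""
    samples = array.array("h", frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def split_on_silence(frames, silence_interval=0.5, threshold=500.0):
    """Yield utterances (bytes); each one ends after `silence_interval` seconds of silence."""
    # Number of consecutive quiet frames that closes an utterance.
    max_silent = max(1, round(silence_interval * 1000 / FRAME_MS))
    buffer, silent, speaking = [], 0, False
    for frame in frames:
        if rms(frame) >= threshold:
            # Loud frame: keep it and reset the silence counter.
            buffer.append(frame)
            silent, speaking = 0, True
        elif speaking:
            # Quiet frame inside an utterance: keep it in case speech resumes.
            buffer.append(frame)
            silent += 1
            if silent >= max_silent:
                # Enough silence: emit the utterance without the trailing quiet frames.
                yield b"".join(buffer[:len(buffer) - silent])
                buffer, silent, speaking = [], 0, False
    if speaking:
        # Flush whatever speech remains when the stream ends.
        yield b"".join(buffer[:len(buffer) - silent])
```

With the default of 0.5 seconds, a pause of about 150 ms stays inside one utterance; lowering the interval to roughly 0.1 seconds would split at that pause and send the first part for recognition earlier.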