A Docker-based speech-to-text (STT) system using faster-whisper, with a FastAPI backend and a Gradio web UI.
## Project Structure

```text
stt-service/           # FastAPI service with faster-whisper + CTranslate2
├── Dockerfile         # Builds CUDA/cuDNN image, installs dependencies, converts model
└── app.py             # FastAPI app: /healthz (instant), /transcribe (lazy model load + auth)
webui-service/         # Gradio front-end
├── Dockerfile         # Builds a slim Python image with Python libs and ffmpeg
├── requirements.txt
└── app.py             # Gradio UI with basic auth, sends X-API-KEY header
docker-compose.yml     # Defines services, ports, and a simple depends_on
README.md              # This documentation
```
## Features

- faster-whisper with CTranslate2 for roughly 2× real-time transcription on GPU
- Lazy loading: the model loads on the first request (20–30 s), then stays in memory
- Token-based auth: `/transcribe` is secured with an `X-API-KEY` header
- UI basic auth: the Gradio interface is protected with a username/password
- Multi-format support: WAV, MP3, OGG/Opus, M4A, FLAC, AMR, etc.
- Health check: `/healthz` returns OK immediately
## Quick Start

```bash
git clone https://github.com/nek1987/stt-service-webui.git
cd stt-service-webui
# (optional) git archive -o backup.tar HEAD
```

## Configuration

In your `.env` file or directly in `docker-compose.yml`:
```yaml
services:
  stt-service:
    environment:
      - MODEL_PATH=/models/islomov_navaistt_v2_medium_ct2
      - NVIDIA_VISIBLE_DEVICES=0
      # Multiple API tokens (comma-separated, no spaces)
      - API_TOKENS=${API_TOKENS}   # e.g., API_TOKENS=token-alpha,token-bravo,token-charlie
      # Single token (backwards compatible)
      - API_TOKEN=${API_TOKEN}     # e.g., API_TOKEN=single-token-key
  webui-service:
    environment:
      - STT_API=http://stt-service:5085/transcribe
      - API_TOKEN=${API_TOKEN}     # the web UI uses a single token
      - UI_USER=${UI_LOGIN}
      - UI_PASS=${UI_PASS}
```

In your `.env` file:
```ini
# API authentication (choose one method)
API_TOKENS=token-alpha,token-bravo,token-charlie   # Multiple tokens
# OR
API_TOKEN=single-token-key                         # Single token

# UI authentication
UI_LOGIN=admin
UI_PASS=s3cret
```

## Build & Run

```bash
docker-compose down --rmi local   # optional: remove old images
docker-compose build              # build both services
docker-compose up -d              # start in detached mode
```
## Verify

```bash
# STT service health
curl http://localhost:5085/healthz
# → {"status":"ok"}

# Gradio UI
open http://localhost:7860
# Will prompt for user/pass (UI_USER/UI_PASS)
```
## STT Service

- Port: 5085
- Endpoints: `GET /healthz` → `{"status":"ok"}`; `POST /transcribe` (multipart field `file`, header `X-API-KEY`)
- Auth: include an `X-API-KEY` header whose value matches one of the configured tokens. Use `API_TOKENS=token1,token2` (comma-separated list) for multiple keys, or `API_TOKEN` for a single legacy token.
- Model path: baked in at `/models/islomov_navaistt_v1_medium_ct2`
- Lazy load: the model initializes on the first `/transcribe` request (see the sketch below)
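A minimal sketch of the lazy-loading pattern (illustrative only; the shipped `app.py` may differ in details such as error handling, auth checks, and model parameters):

```python
import os
import tempfile

from fastapi import FastAPI, File, UploadFile
from faster_whisper import WhisperModel

app = FastAPI()
_model = None  # created lazily on the first /transcribe call


def get_model() -> WhisperModel:
    global _model
    if _model is None:  # the first request pays the 20–30 s load cost
        _model = WhisperModel(
            os.environ.get("MODEL_PATH", "/models/islomov_navaistt_v1_medium_ct2"),
            device="cuda",
            compute_type="float16",
        )
    return _model


@app.get("/healthz")
def healthz():
    # Answers immediately; never touches the model.
    return {"status": "ok"}


@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Persist the upload to a temp file, then run the model on it.
    suffix = os.path.splitext(file.filename or "")[1]
    with tempfile.NamedTemporaryFile(suffix=suffix) as tmp:
        tmp.write(await file.read())
        tmp.flush()
        segments, _info = get_model().transcribe(tmp.name)
        return {"text": " ".join(s.text.strip() for s in segments)}
```

(The `X-API-KEY` check is omitted here for brevity; the real service enforces it.)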
### Token Configuration

- Multiple tokens: set `API_TOKENS` to a comma-separated list without spaces, e.g. `API_TOKENS=service-a-key,service-b-key,service-c-key`.
- Single token (backwards compatible): define `API_TOKEN=service-a-key`. The value is automatically combined with any tokens in `API_TOKENS`.
- No tokens: omit both variables to leave the `/transcribe` endpoint open (not recommended for production deployments).
Token validation:

- Minimum length: 8 characters (shorter tokens are rejected with a warning)
- Whitespace is automatically trimmed
- Duplicate tokens are automatically removed
- On startup, all configured tokens are logged (masked for security, e.g., `token-***pha`)
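The sketch below shows one way the configuration and validation rules above could fit together. The masking format is inferred from the log examples in this README; everything else (function names, exact log messages) is illustrative, not the service's actual code:

```python
import logging
import os

logger = logging.getLogger("stt-service")


def mask(token: str) -> str:
    # e.g. "token-alpha" -> "token-***pha" (format assumed from the examples above)
    return f"{token[:6]}***{token[-3:]}" if len(token) > 9 else "***"


def load_tokens() -> set[str]:
    raw = os.environ.get("API_TOKENS", "").split(",")
    raw.append(os.environ.get("API_TOKEN", ""))  # legacy single token is merged in
    tokens = set()
    for t in (t.strip() for t in raw):           # whitespace is trimmed
        if not t:
            continue
        if len(t) < 8:                           # short tokens are rejected
            logger.warning("Rejected too-short token %s", mask(t))
            continue
        tokens.add(t)                            # set membership removes duplicates
    for t in sorted(tokens):
        logger.info("Configured API token %s", mask(t))  # masked startup log
    return tokens
```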
Token usage logging:

- Each authenticated request logs the masked token used (e.g., `token-***pha`)
- Failed authentication attempts are logged with the (masked) invalid token
- Requests without an API key are logged as "No API key provided"
## WebUI Service

- Port: 7860
- Auth: basic HTTP auth (username `UI_USER`, password `UI_PASS`)
- UI → service: the `X-API-KEY` header is sent automatically (see the sketch below)
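A minimal sketch of the web UI wiring (illustrative; the shipped `webui-service/app.py` may differ):

```python
import os

import gradio as gr
import requests

STT_API = os.environ.get("STT_API", "http://stt-service:5085/transcribe")
API_TOKEN = os.environ.get("API_TOKEN", "")


def transcribe(audio_path: str) -> str:
    # Forward the recorded/uploaded file to the STT service with the
    # X-API-KEY header the backend expects.
    with open(audio_path, "rb") as f:
        resp = requests.post(
            STT_API,
            headers={"X-API-KEY": API_TOKEN},
            files={"file": f},
            timeout=300,
        )
    resp.raise_for_status()
    return resp.json()["text"]


demo = gr.Interface(fn=transcribe, inputs=gr.Audio(type="filepath"), outputs="text")
demo.launch(
    server_name="0.0.0.0",
    server_port=7860,
    auth=(os.environ["UI_USER"], os.environ["UI_PASS"]),  # basic HTTP auth
)
```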
## API Reference

```text
GET http://<host>:5085/healthz
→ 200 OK  {"status":"ok"}
```

```text
POST http://<host>:5085/transcribe
Headers:
  X-API-KEY: token-alpha
  Accept: application/json
Body:
  multipart/form-data, field "file" = audio file
Response 200 OK:
  {"text": "transcribed text here"}
Errors:
  401 Unauthorized          if the token is missing or invalid
  400 Bad Request           if no file is supplied
  500 Internal Server Error on model load/inference failures
```
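An equivalent call from Python (assumes the service is reachable on localhost, `token-alpha` is one of the configured tokens, and `sample.wav` is a placeholder file name):

```python
import requests

with open("sample.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:5085/transcribe",
        headers={"X-API-KEY": "token-alpha", "Accept": "application/json"},
        files={"file": ("sample.wav", f, "audio/wav")},
        timeout=300,
    )
resp.raise_for_status()  # raises on 401/400/500
print(resp.json()["text"])
```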
## Extending

- Increase GPU concurrency by running multiple instances behind a load balancer.
- To stream partial results, iterate over the segment generator that `model.transcribe(...)` returns and push segments to the client as they are produced (faster-whisper yields segments lazily rather than taking a `stream=True` flag).
- Tune `beam_size` and `compute_type` in `app.py` to trade quality against speed (see the sketch below).
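A sketch of those knobs in isolation (example values, not the project's defaults):

```python
from faster_whisper import WhisperModel

model = WhisperModel(
    "/models/islomov_navaistt_v1_medium_ct2",
    device="cuda",
    compute_type="int8_float16",  # faster and smaller than float16, slightly lower quality
)
segments, info = model.transcribe(
    "sample.wav",
    beam_size=5,      # larger beams improve accuracy at the cost of speed
    vad_filter=True,  # optional: skip long stretches of silence
)
for segment in segments:  # the generator yields segments lazily -> partial results
    print(f"[{segment.start:6.2f} -> {segment.end:6.2f}] {segment.text}")
```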
## Author

Jamshid Radjabov, telecom expert and AI enthusiast.

Pull requests and issues are welcome!