A simple voice-to-text tool that lets you transcribe speech to text with a single keyboard shortcut. The transcribed text is automatically pasted wherever your cursor is located.
- 🎯 Single keyboard shortcut activation
- 🔒 Privacy-focused (no cloud services, works offline)
- ⚡ Fast transcription
- 📝 Paste transcribed text directly from clipboard with Ctrl+V
- 🧹 Automatic cleanup of temporary files
- Linux (the commands below use Fedora's dnf; adapt package names for other distributions)
- Python 3.6 or higher
- GNOME Desktop Environment (for easy shortcut setup); xclip and xdotool are X11 tools, so pasting may not work in a Wayland session
- Install system dependencies:
sudo dnf install python3-pip sox xclip xdotool
- Install Vosk through pip:
pip install vosk
- Download and set up the Vosk model:
mkdir -p ~/.local/share/vosk
cd ~/.local/share/vosk
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
mv vosk-model-small-en-us-0.15 model-small-en
- Create the script directory and save the transcription script:
mkdir -p ~/bin
Create ~/bin/quick-transcribe.py with the following content:
#!/usr/bin/env python3
import atexit
import json
import os
import signal
import subprocess
import sys
import time
import wave

from vosk import Model, KaldiRecognizer

# Path for temporary audio file
temp_dir = "/tmp"
audio_file = os.path.join(temp_dir, "recording.wav")


def cleanup(file_path):
    """Overwrite the recording with zeros, then delete it."""
    try:
        if os.path.exists(file_path):
            # Read the size first and open with 'r+b': opening with 'wb'
            # truncates the file to 0 bytes, so nothing would be overwritten.
            size = os.path.getsize(file_path)
            with open(file_path, "r+b") as f:
                f.write(b"\x00" * size)
                f.flush()
                os.fsync(f.fileno())
            os.remove(file_path)
    except Exception:
        pass


def handle_signal(signum, frame):
    # sys.exit raises SystemExit, so the finally block below still runs
    sys.exit(1)


# Register cleanup for normal exit and signals
atexit.register(cleanup, audio_file)
signal.signal(signal.SIGINT, handle_signal)
signal.signal(signal.SIGTERM, handle_signal)

try:
    model_path = os.path.expanduser("~/.local/share/vosk/model-small-en")

    # Notify recording start
    subprocess.run(["notify-send", "Recording will start in 3 seconds..."])
    time.sleep(3)
    subprocess.run(["notify-send", "Recording... Speak now (stops after 2 seconds of silence)"])

    # Record 16 kHz mono audio; the silence effect ends the recording
    subprocess.run([
        "rec",
        "-r", "16000",
        "-c", "1",
        audio_file,
        "silence", "1", "0.1", "1%", "1", "2.0", "1%"
    ])

    # Load model and process audio
    model = Model(model_path)
    text = ""
    with wave.open(audio_file, "rb") as wf:
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if len(data) == 0:
                break
            if rec.AcceptWaveform(data):
                result = json.loads(rec.Result())
                text += result.get("text", "") + " "
        final_result = json.loads(rec.FinalResult())
        text += final_result.get("text", "")

    # Copy to clipboard and paste
    subprocess.run(["xclip", "-selection", "clipboard"], input=text.strip().encode())
    subprocess.run(["xdotool", "key", "ctrl+v"])
finally:
    # Ensure cleanup happens even if an error occurs
    cleanup(audio_file)
- Make the script executable:
chmod +x ~/bin/quick-transcribe.py
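Before setting up the shortcut, it can help to confirm that everything the script relies on is in place. The following is a minimal sketch, not part of the original project: the function name `find_problems` and the "directory is not empty" check are illustrative assumptions, while the command names and the model path are taken from the steps above.

```python
import os
import shutil

# Commands the transcription script shells out to.
REQUIRED_COMMANDS = ["rec", "xclip", "xdotool", "notify-send"]
# Model location used by the install steps above.
MODEL_DIR = os.path.expanduser("~/.local/share/vosk/model-small-en")


def find_problems(commands=REQUIRED_COMMANDS, model_dir=MODEL_DIR):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for name in commands:
        # shutil.which returns None when the command is not on PATH
        if shutil.which(name) is None:
            problems.append(f"command not found on PATH: {name}")
    if not os.path.isdir(model_dir):
        problems.append(f"model directory missing: {model_dir}")
    elif not os.listdir(model_dir):
        problems.append(f"model directory is empty: {model_dir}")
    return problems


if __name__ == "__main__":
    issues = find_problems()
    for issue in issues:
        print(issue)
    if not issues:
        print("All dependencies found.")
```

Running it from a terminal prints one line per missing piece, which is easier to read than a shortcut that silently does nothing.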
- Open GNOME Settings
- Navigate to Keyboard Shortcuts
- Click the + at the bottom
- Add a new custom shortcut:
  - Name: Quick Transcribe
  - Command: /home/YOUR_USERNAME/bin/quick-transcribe.py (use the full path, replacing YOUR_USERNAME with your login name)
  - Shortcut: Choose something like Ctrl+Alt+R
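The example command uses a full /home/... path, which suggests the Command field should not rely on `~` being expanded. A small sketch for printing the exact path to paste is shown here; the helper name `shortcut_command` is hypothetical, and the default location is the one used in the installation steps.

```python
import os


def shortcut_command(script="~/bin/quick-transcribe.py"):
    """Return the absolute path to paste into the shortcut's Command field."""
    return os.path.abspath(os.path.expanduser(script))


if __name__ == "__main__":
    path = shortcut_command()
    print(path)
    # Warn early if the chmod +x step was skipped or the file is missing
    if not os.access(path, os.X_OK):
        print("warning: script is missing or not executable")
```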
- Place your cursor where you want the transcribed text to appear
- Press your configured keyboard shortcut (e.g., Ctrl+Alt+R)
- Wait for the "Recording will start in 3 seconds..." notification
- Speak clearly after the "Recording..." notification appears
- Stop speaking for 2 seconds to automatically end recording
- The transcribed text will automatically appear at your cursor location
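The automatic stop after 2 seconds comes from the `silence` effect passed to SoX's `rec`. In `silence 1 0.1 1% 1 2.0 1%`, the first triple starts the recording once audio stays above the 1% threshold for 0.1 seconds, and the second triple ends it after 2.0 seconds below that threshold. The sketch below shows how those values could be made adjustable; the function and parameter names are illustrative assumptions, and the defaults reproduce the command used by the script.

```python
def build_rec_command(output_path, sample_rate=16000, channels=1,
                      start_duration=0.1, stop_silence=2.0, threshold="1%"):
    """Build the argv list for SoX's `rec` with the silence effect."""
    return [
        "rec",
        "-r", str(sample_rate),   # sample rate in Hz
        "-c", str(channels),      # 1 = mono
        output_path,
        "silence",
        # start: begin once sound exceeds the threshold for start_duration
        "1", str(start_duration), threshold,
        # stop: end after stop_silence seconds below the threshold
        "1", str(stop_silence), threshold,
    ]


if __name__ == "__main__":
    # Example: tolerate longer pauses (3.5 s) before the recording ends
    print(build_rec_command("/tmp/recording.wav", stop_silence=3.5))
```

Raising `stop_silence` helps if recordings are cut off during natural pauses; raising `threshold` may help in a noisy room where the recording never ends.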
- All processing is done locally on your machine
- No audio data is sent to external servers
- Temporary audio files are:
  - Created in /tmp/recording.wav
  - Overwritten with zeros before deletion
  - Automatically cleaned up, even if the script exits with an error
  - Present only during recording and transcription
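The overwrite-then-delete step has one easy pitfall: opening a file with mode `'wb'` truncates it to zero bytes immediately, so its size must be read beforehand and the file opened with `'r+b'` for the zeros to actually replace the audio data. A minimal standalone sketch of that idea follows; the function name `overwrite_and_delete` is hypothetical. Note that overwriting in place is best effort only: on SSDs, copy-on-write filesystems, or a RAM-backed /tmp, it does not guarantee the old bytes are physically erased.

```python
import os
import tempfile


def overwrite_and_delete(path):
    """Overwrite a file with zeros in place, then remove it."""
    size = os.path.getsize(path)      # read the size before opening
    with open(path, "r+b") as f:      # r+b does not truncate the file
        f.write(b"\x00" * size)
        f.flush()
        os.fsync(f.fileno())          # ask the OS to flush the write
    os.remove(path)


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real recording
    fd, demo_path = tempfile.mkstemp(suffix=".wav")
    with os.fdopen(fd, "wb") as f:
        f.write(b"fake audio bytes")
    overwrite_and_delete(demo_path)
    print("file still exists:", os.path.exists(demo_path))
```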
This project is licensed under the Apache License 2.0; see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.