I guess I could add something like this to convert_output.py ...
import re
import textwrap

def add_paragraphs(transcribed_text, sentences_per_paragraph=5, line_width=70):
    # normalize spaces
    transcribed_text = re.sub(r'\s+', ' ', transcribed_text).strip()

    # split the text into sentences, keeping the punctuation with the sentence
    sentence_endings = re.compile(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?|\!)\s')
    sentences = sentence_endings.split(transcribed_text)

    paragraphs = []
    for i in range(0, len(sentences), sentences_per_paragraph):
        paragraph = ' '.join(sentences[i:i + sentences_per_paragraph]).strip()
        wrapped_paragraph = textwrap.fill(paragraph, width=line_width)
        paragraphs.append(wrapped_paragraph)

    formatted_text = '\n\n'.join(paragraphs)

    # ensure the last sentence has punctuation
    if not re.search(r'[.!?]$', formatted_text.strip()):
        formatted_text += '.'

    return formatted_text

with open('output.txt', 'r') as file:
    transcribed_text = file.read()

formatted_text = add_paragraphs(transcribed_text, sentences_per_paragraph=3, line_width=70)

with open('formatted_output.txt', 'w') as file:
    file.write(formatted_text)

print("Formatted text with paragraphs and line wrapping has been saved to 'formatted_output.txt'.")
Wow! Well done, spot on, etc. This worked perfectly on a MacBook Pro M3 after doing this:
install:
pipx install insanely-fast-whisper
export PYTORCH_ENABLE_MPS_FALLBACK=1
insanely-fast-whisper --file-name audio.mp3 --device-id mps --model-name openai/whisper-large-v3 --batch-size 4
Since insanely-fast-whisper only does audio:
take a YouTube video about whatever, or read a book/chapter aloud, then:
pip install pytube pydub
from pytube import YouTube
from pydub import AudioSegment  # pydub needs ffmpeg installed to decode/encode

video_url = "https://youtu.be/wGdHxSIYEIo?si=1r9krMIwerPusbZW"
yt = YouTube(video_url)

# grab the first audio-only stream and download it
audio_stream = yt.streams.filter(only_audio=True).first()
audio_file = audio_stream.download(filename='audio')

# re-encode the downloaded audio as mp3
audio = AudioSegment.from_file(audio_file)
audio.export("audio.mp3", format="mp3")
yields: audio.mp3 so then do:
insanely-fast-whisper --file-name audio.mp3 --device-id mps --model-name openai/whisper-large-v3 --batch-size 4
that yields: output.json so then do:
python -B convert_output.py output.json -f txt -o .
... see:
https://github.com/Vaibhavs10/insanely-fast-whisper/blob/main/convert_output.py
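The transcribe and convert commands above could also be chained from one small script. This is only a sketch: `build_commands` and `run_pipeline` are made-up helper names, the flags are copied from the commands in this post, and it assumes convert_output.py sits in the current directory and that the CLI writes output.json there.

```python
import os
import subprocess
import sys


def build_commands(audio_path="audio.mp3", json_path="output.json"):
    """Return the two commands from this post as argument lists (hypothetical helper)."""
    transcribe = [
        "insanely-fast-whisper",
        "--file-name", audio_path,
        "--device-id", "mps",
        "--model-name", "openai/whisper-large-v3",
        "--batch-size", "4",
    ]
    # Assumes convert_output.py has been downloaded next to this script.
    convert = [sys.executable, "-B", "convert_output.py", json_path, "-f", "txt", "-o", "."]
    return [transcribe, convert]


def run_pipeline(audio_path="audio.mp3"):
    """Run both steps in order, stopping at the first failure."""
    # Pass the MPS fallback variable to the child processes explicitly.
    env = {**os.environ, "PYTORCH_ENABLE_MPS_FALLBACK": "1"}
    for cmd in build_commands(audio_path):
        subprocess.run(cmd, check=True, env=env)
```

Calling `run_pipeline("audio.mp3")` would run both steps with the fallback variable set only for those child processes, so nothing needs to be exported in the shell.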
yields: output.txt so the last step is:
paragraphs, but how?
Now, insanely perfect would be paragraphs ... they don't even have to be semantically/grammatically correct.
For now, I send output.txt to ChatGPT 4o or Gemini 1.5 Pro with this prompt:
"without changing any of the words, please make paragraphs where appropriate, just for human readability to this text:"
... which also works great, but it would be nice to do this locally.
I tested with 30+ minute and 1+ hour videos (which transcribed in 4 mins 20 secs = wow).
Thanks ... fingers crossed for paragraphs too.
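One local option that needs no LLM might be to break paragraphs wherever the speaker pauses, using the timestamps in output.json instead of output.txt. The sketch below assumes output.json has a top-level "chunks" list whose items look like {"timestamp": [start, end], "text": "..."}; check your own file first. `paragraphs_from_chunks`, `format_file` and the `min_pause` threshold are made-up names, not part of insanely-fast-whisper.

```python
import json


def paragraphs_from_chunks(chunks, min_pause=1.0):
    """Join chunk texts, starting a new paragraph after a pause of min_pause seconds."""
    paragraphs, current, prev_end = [], [], None
    for chunk in chunks:
        # Assumed chunk shape: {"timestamp": [start, end], "text": "..."}
        start, end = chunk.get("timestamp") or (None, None)
        text = chunk.get("text", "").strip()
        if not text:
            continue
        # Only compare when both timestamps are known (an end time can be missing).
        gap_known = prev_end is not None and start is not None
        if current and gap_known and start - prev_end >= min_pause:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        prev_end = end
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)


def format_file(json_path="output.json", txt_path="formatted_output.txt", min_pause=1.0):
    """Read the transcript JSON and write the paragraph-split text (hypothetical helper)."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write(paragraphs_from_chunks(data["chunks"], min_pause))
```

Calling `format_file()` after the transcription step would write formatted_output.txt. Whether the pauses land at sensible places depends on the audio and on how the timestamps were chunked, so `min_pause` will likely need tuning, and the words themselves are never changed.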