Releases · LlamaEdge/whisper-api-server

Major changes:

New CLI options:
- threads: Number of threads to use during computation. Defaults to 4.
- processors: Number of processors to use during computation. Defaults to 1.
- task: Task type. Defaults to full. Possible values: transcribe, translate, full.
- port: Port number. Defaults to 8080.
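
As a rough illustration of the options above, the following Python sketch models them with their documented defaults and validates the `task` value. The `ServerOptions` class and the `--name value` flag spelling are assumptions made for this example; the release notes list only the option names, so check the server's help output for the exact syntax.

```python
from dataclasses import dataclass

# Values listed in the release notes for the `task` option.
VALID_TASKS = ("transcribe", "translate", "full")


@dataclass
class ServerOptions:
    """Hypothetical container for the new CLI options and their documented defaults."""
    threads: int = 4
    processors: int = 1
    task: str = "full"
    port: int = 8080

    def __post_init__(self):
        if self.task not in VALID_TASKS:
            raise ValueError(f"task must be one of {VALID_TASKS}, got {self.task!r}")
        if self.threads < 1 or self.processors < 1:
            raise ValueError("threads and processors must be at least 1")
        if not 0 < self.port < 65536:
            raise ValueError("port must be in the range 1-65535")

    def to_args(self):
        # Assumption: each option is passed as `--<name> <value>`.
        return [
            "--threads", str(self.threads),
            "--processors", str(self.processors),
            "--task", self.task,
            "--port", str(self.port),
        ]


if __name__ == "__main__":
    print(" ".join(ServerOptions(task="transcribe", port=9090).to_args()))
```

Validating `task` before launching the server surfaces a typo early instead of leaving it to the server's own argument parser.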
Support new fields for transcription requests:
- language: The language of the input audio. Defaults to en.
- temperature: Sampling temperature, between 0 and 1. Defaults to 0.00.
- prompt: Text to guide the model's style or continue a previous audio segment. Defaults to none.
- max_len: Maximum number of tokens that the model can generate in a single transcription segment (or chunk). Defaults to 0.
- split_on_word: Split audio chunks on word rather than on token. Defaults to false.
- detect_language: Automatically detect the spoken language in the provided audio input. Defaults to false.
- offset_time: Time offset in milliseconds. Defaults to 0.
- duration: Length of audio (in seconds) to be processed starting from the point defined by the offset_time field (or from the beginning by default). Defaults to 0.
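
The transcription fields above could be assembled on the client side roughly as follows. This is a minimal sketch: the helper name `transcription_fields` is hypothetical, and sending booleans as lowercase `true`/`false` strings is an assumption. The resulting fields would be posted together with the audio file as multipart form data, presumably to an OpenAI-style transcription route, which the release notes do not name.

```python
def transcription_fields(
    language="en",
    temperature=0.0,
    prompt=None,
    max_len=0,
    split_on_word=False,
    detect_language=False,
    offset_time=0,
    duration=0,
):
    """Build text form fields for a transcription request using the documented defaults."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    fields = {
        "language": language,
        "temperature": str(temperature),
        "max_len": str(max_len),
        # Assumption: booleans are sent as lowercase strings.
        "split_on_word": str(split_on_word).lower(),
        "detect_language": str(detect_language).lower(),
        "offset_time": str(offset_time),  # milliseconds
        "duration": str(duration),        # seconds
    }
    # `prompt` has no default value, so it is omitted unless supplied.
    if prompt is not None:
        fields["prompt"] = prompt
    return fields


if __name__ == "__main__":
    print(transcription_fields(language="de", prompt="Meeting notes:"))
```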
Support new fields for translation requests:
- detect_language: Automatically detect the spoken language in the provided audio input. Defaults to false.
- offset_time: Time offset in milliseconds. Defaults to 0.
- duration: Length of audio (in seconds) to be processed starting from the point defined by the offset_time field (or from the beginning by default). Defaults to 0.
- max_len: Maximum number of tokens that the model can generate in a single transcription segment (or chunk). Defaults to 0.
- split_on_word: Split audio chunks on word rather than on token. Defaults to false.
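
Because `offset_time` is given in milliseconds while `duration` is given in seconds, a small helper can keep the units straight when selecting a window of audio. The sketch below is illustrative only: `window_fields` is a hypothetical name, rounding `duration` up to whole seconds is an assumption, and so is reading a `duration` of 0 as "process to the end of the audio".

```python
import math


def window_fields(start_seconds, end_seconds=None):
    """Convert a start/end time in seconds into offset_time (ms) and duration (s)."""
    if start_seconds < 0:
        raise ValueError("start_seconds must not be negative")
    # offset_time is expressed in milliseconds; 0 is the documented default for duration.
    fields = {"offset_time": round(start_seconds * 1000), "duration": 0}
    if end_seconds is not None:
        if end_seconds <= start_seconds:
            raise ValueError("end_seconds must be greater than start_seconds")
        # Assumption: duration is a whole number of seconds, rounded up
        # so the end of the requested window is not cut off.
        fields["duration"] = math.ceil(end_seconds - start_seconds)
    return fields


if __name__ == "__main__":
    # Process one minute of audio starting at 1:30.
    print(window_fields(90, 150))
```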

Releases: LlamaEdge/whisper-api-server

LlamaEdge-Whisper 0.3.2

LlamaEdge-Whisper 0.3.1

LlamaEdge-Whisper 0.3.0

LlamaEdge-Whisper 0.2.2

LlamaEdge-Whisper 0.2.1

LlamaEdge-Whisper 0.2.0