Releases: LlamaEdge/whisper-api-server

LlamaEdge-Whisper 0.3.2

12 Nov 09:16

Major changes:

  • New endpoints

    • GET /v1/files/{file_id}: Retrieve information about a specific file by ID
    • DELETE /v1/files/{file_id}: Remove a specific file by ID
  • Upgrade to llama-core v0.22.0
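A minimal sketch of how a client might address the two new file endpoints, using only the Python standard library. The base URL (localhost on port 8080) and the helper name `file_request` are assumptions for illustration; the release notes do not describe the response format, so the sketch only builds the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the server runs locally on port 8080.
BASE_URL = "http://localhost:8080"


def file_request(file_id: str, method: str = "GET", base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a request for /v1/files/{file_id}.

    `file_request` is a hypothetical helper, not part of the server or any SDK.
    """
    if method not in ("GET", "DELETE"):
        raise ValueError(f"unsupported method: {method}")
    # Percent-encode the id so it is safe to place in the URL path.
    url = f"{base_url}/v1/files/{quote(file_id, safe='')}"
    return Request(url, method=method)
```

Passing the returned object to `urllib.request.urlopen` would send it: `file_request("some_id")` retrieves the file information and `file_request("some_id", "DELETE")` removes the file.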

LlamaEdge-Whisper 0.3.1

06 Nov 08:48

Major changes:

  • New CLI options:

    • threads: Number of threads to use during computation. Defaults to 4.
    • processors: Number of processors to use during computation. Defaults to 1.
    • task: Task type. Defaults to full. Possible values: transcribe, translate, full.
    • port: Port number. Defaults to 8080.
  • Support new fields for transcription requests

    • language: The language of the input audio. Defaults to en.
    • temperature: Sampling temperature, between 0 and 1. Defaults to 0.00.
    • prompt: Text to guide the model's style or continue a previous audio segment. Defaults to none.
    • max_len: Maximum number of tokens that the model can generate in a single transcription segment (or chunk). Defaults to 0.
    • split_on_word: Split audio chunks on word rather than on token. Defaults to false.
    • detect_language: Automatically detect the spoken language in the provided audio input. Defaults to false.
    • offset_time: Time offset in milliseconds. Defaults to 0.
    • duration: Length of audio (in seconds) to be processed starting from the point defined by the offset_time field (or from the beginning by default). Defaults to 0.
  • Support new fields for translation requests

    • detect_language: Automatically detect the spoken language in the provided audio input. Defaults to false.
    • offset_time: Time offset in milliseconds. Defaults to 0.
    • duration: Length of audio (in seconds) to be processed starting from the point defined by the offset_time field (or from the beginning by default). Defaults to 0.
    • max_len: Maximum number of tokens that the model can generate in a single transcription segment (or chunk). Defaults to 0.
    • split_on_word: Split audio chunks on word rather than on token. Defaults to false.
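The transcription fields and defaults listed above can be collected in one place, which also makes the unit difference between offset_time (milliseconds) and duration (seconds) harder to get wrong. This is a sketch under stated assumptions: the helper name `transcription_fields` is hypothetical, and how the server expects each value to be encoded in the request is not covered by these notes.

```python
# Defaults copied from the 0.3.1 notes for transcription requests.
TRANSCRIPTION_DEFAULTS = {
    "language": "en",
    "temperature": 0.0,
    "prompt": None,          # "none" in the notes: no prompt is sent
    "max_len": 0,
    "split_on_word": False,
    "detect_language": False,
    "offset_time": 0,        # milliseconds
    "duration": 0,           # seconds
}


def transcription_fields(**overrides):
    """Merge caller overrides into the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - set(TRANSCRIPTION_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    fields = {**TRANSCRIPTION_DEFAULTS, **overrides}
    # The notes state that temperature lies between 0 and 1.
    if not 0 <= fields["temperature"] <= 1:
        raise ValueError("temperature must be between 0 and 1")
    # Drop fields that are unset so they are not sent at all.
    return {name: value for name, value in fields.items() if value is not None}
```

For example, `transcription_fields(offset_time=5000, duration=10)` describes a request that skips the first 5 seconds of audio and processes the following 10 seconds.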

LlamaEdge-Whisper 0.3.0

23 Sep 11:31

Major change:

  • Remove support for the /v1/audio/speech endpoint. The endpoint will be supported in the upcoming tts-api-server.

LlamaEdge-Whisper 0.2.2

20 Sep 02:38

Major change:

  • Migrate to WasmEdge v0.14.1

LlamaEdge-Whisper 0.2.1

14 Sep 05:53
Pre-release

Major changes:

  • /v1/audio/translations endpoint for translating audio into English text.
  • Integrate wasi_nn-piper plugin

LlamaEdge-Whisper 0.2.0

26 Aug 13:26

Major features:

  • /v1/audio/transcriptions endpoint for transcribing audio into the input language.
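As an illustration of calling the transcription endpoint, the sketch below builds a multipart/form-data POST with the Python standard library. Several details are assumptions not confirmed by these notes: the upload form field is named "file", the server accepts a generic application/octet-stream part, and the server listens on localhost port 8080. The helper name `transcription_request` is hypothetical, and the request is built but not sent.

```python
import uuid
from urllib.request import Request


def transcription_request(audio, filename, fields=None,
                          base_url="http://localhost:8080"):
    """Build (but do not send) a multipart POST to /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields, e.g. {"language": "en"}.
    for name, value in (fields or {}).items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    # The audio payload. Assumption: the form field is named "file".
    header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    parts.append(header + audio + b"\r\n")
    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode())
    return Request(
        f"{base_url}/v1/audio/transcriptions",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With a running server, reading a WAV file into bytes and passing the result of `transcription_request(data, "audio.wav")` to `urllib.request.urlopen` would submit it for transcription.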