Elixir client for IBM Cloud Speech to Text service
The package can be installed by adding :ibm_speech_to_text to your list of dependencies in mix.exs:
    def deps do
      [
        {:ibm_speech_to_text, "~> 0.3"}
      ]
    end
The docs can be found on hexdocs.pm
- Start the client process. Pass the API URL or a region (as an atom) and an API key obtained from the IBM Cloud console. start_link also accepts parameters for the endpoint; see the docs for more details.
    {:ok, pid} = IBMSpeechToText.Client.start_link(:frankfurt, "API_KEY", model: "en-GB_BroadbandModel")
- Send a "start" event with the configuration for speech recognition
    start_message = %IBMSpeechToText.Start{content_type: :flac}
    IBMSpeechToText.Client.send_message(pid, start_message)
- Start audio streaming
    IBMSpeechToText.Client.send_data(pid, audio_data)
- Stop streaming by sending a "stop" message
    stop_message = %IBMSpeechToText.Stop{}
    IBMSpeechToText.Client.send_message(pid, stop_message)
- You will receive the results as messages containing an IBMSpeechToText.Response struct
    %IBMSpeechToText.Response{
      result_index: 0,
      results: [
        %IBMSpeechToText.RecognitionResult{
          alternatives: [
            %IBMSpeechToText.RecognitionAlternative{
              confidence: 0.87,
              timestamps: nil,
              transcript: "to Sherlock Holmes she's always the woman ",
              word_confidence: nil
            }
          ],
          final: true,
          keywords_result: nil,
          word_alternatives: nil
        },
        ...
      ],
      speaker_labels: nil,
      warnings: nil
    }
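The streaming and result steps above can be sketched with two small helpers. StreamingHelpers is a hypothetical module, not part of ibm_speech_to_text: chunk/2 splits an audio binary into pieces that could be passed to send_data, and final_transcripts/1 pulls the transcript of the first alternative out of every final result. It matches on plain map keys, so it should work on the structs shown above as well as on ordinary maps. Treating the first alternative as the preferred one is an assumption.

```elixir
defmodule StreamingHelpers do
  @moduledoc """
  Hypothetical helpers for illustration only; they are not part of
  the ibm_speech_to_text package.
  """

  @doc "Splits `data` into binaries of at most `chunk_size` bytes."
  def chunk(data, chunk_size)
      when is_binary(data) and is_integer(chunk_size) and chunk_size > 0 do
    do_chunk(data, chunk_size, [])
  end

  # Nothing left to split: return the chunks in their original order.
  defp do_chunk(<<>>, _size, acc), do: Enum.reverse(acc)

  # The remainder fits into a single chunk.
  defp do_chunk(data, size, acc) when byte_size(data) <= size do
    Enum.reverse([data | acc])
  end

  # Cut `size` bytes off the front and continue with the rest.
  defp do_chunk(data, size, acc) do
    <<head::binary-size(size), rest::binary>> = data
    do_chunk(rest, size, [head | acc])
  end

  @doc """
  Returns the trimmed transcript of the first alternative of every
  result marked `final: true`. Anything without a list under
  `:results` yields an empty list.
  """
  def final_transcripts(%{results: results}) when is_list(results) do
    # Results that are not final, or have no alternatives, are skipped
    # because they do not match the generator pattern.
    for %{final: true, alternatives: [first | _]} <- results do
      String.trim(first.transcript)
    end
  end

  def final_transcripts(_other), do: []
end
```

Assuming the client accepts audio spread over several send_data calls, each chunk could be passed on with Enum.each(chunks, &IBMSpeechToText.Client.send_data(pid, &1)), and a response received in the calling process could be handed to StreamingHelpers.final_transcripts/1.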
The test tagged :external is excluded by default, since it contacts the real API and requires an API key provided via config.
To provide the key, add a config/test.secret.exs file with the following content:
    use Mix.Config
    config :ibm_speech_to_text, api_key: "YOUR_API_KEY"
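As a rough sketch of how such a secret file is commonly wired in, a config/test.exs along the following lines would load it only when it is present. This is an assumption about the setup, not a copy of this project's actual configuration.

```elixir
# config/test.exs (hypothetical wiring; the project's real config may differ)
use Mix.Config

# Load the secret file only when it exists, so the test suite still
# compiles on machines that have no API key configured.
if File.exists?(Path.expand("test.secret.exs", __DIR__)) do
  import_config "test.secret.exs"
end
```

With a setup like this, a test can read the key with Application.get_env(:ibm_speech_to_text, :api_key), and the excluded test can be run with mix test --include external.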
The recording fragment in test/fixtures comes from the audiobook "The Adventures of Sherlock Holmes (version 2)", available on LibriVox.
There are a few things that are not implemented in the current version:
- parsing "word_alternatives" and "keywords_result" in RecognitionResult
- better way to pass endpoint options to client
Copyright 2019, Software Mansion
Licensed under the Apache License, Version 2.0