update samples version ref to 1.12.1 (#677)
* update samples version ref to 1.12.1
* fix typo in C# speaker recognition sample.
* update samples references to 1.12.1 and pull in new speaker recognition samples
* restoring the ? in speaker recognition, it wasn't a typo. If the samples repo fails again complaining about this line and character, I'll follow up in a different PR.
1 parent f356a22, commit 5670456
Showing 140 changed files with 1,626 additions and 111 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<packages> | ||
- <package id="Microsoft.CognitiveServices.Speech" version="1.12.0" targetFramework="native" /> | ||
+ <package id="Microsoft.CognitiveServices.Speech" version="1.12.1" targetFramework="native" /> | ||
</packages> |
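The same one-line version bump repeats across every `packages.config` in this commit. As a rough sketch of how such an edit could be scripted, the hypothetical helper below (`bumpPackageVersion` is not part of the samples repository) rewrites the `version` attribute of a named package in the file's text. It assumes the `version` attribute follows `id` inside the same element, as it does in the files shown here, and it does plain string matching rather than real XML parsing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only: replaces the version of every
// <package id="packageId" version="..."> entry in packages.config text.
// Assumes `version` comes after `id` within the same element.
std::string bumpPackageVersion(const std::string& xml,
                               const std::string& packageId,
                               const std::string& newVersion)
{
    assert(!packageId.empty() && !newVersion.empty());

    const std::string idAttr = "id=\"" + packageId + "\"";
    const std::string versionKey = "version=\"";

    std::string out = xml;
    std::size_t pos = 0;
    while ((pos = out.find(idAttr, pos)) != std::string::npos)
    {
        const std::size_t elementEnd = out.find('>', pos);
        const std::size_t keyPos = out.find(versionKey, pos);

        // Skip this match if no version attribute exists inside the element.
        if (keyPos == std::string::npos || elementEnd == std::string::npos || keyPos > elementEnd)
        {
            pos += idAttr.size();
            continue;
        }

        const std::size_t valueStart = keyPos + versionKey.size();
        const std::size_t valueEnd = out.find('"', valueStart);
        if (valueEnd == std::string::npos)
        {
            break; // Malformed attribute; leave the rest untouched.
        }

        out.replace(valueStart, valueEnd - valueStart, newVersion);
        pos = valueStart + newVersion.size();
    }
    return out;
}
```

Calling `bumpPackageVersion(text, "Microsoft.CognitiveServices.Speech", "1.12.1")` on the content above would yield the updated line; entries for other package ids are left unchanged.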
2 changes: 1 addition & 1 deletion
quickstart/cpp/windows/from-microphone/helloworld/packages.config
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<packages> | ||
- <package id="Microsoft.CognitiveServices.Speech" version="1.12.0" targetFramework="native" /> | ||
+ <package id="Microsoft.CognitiveServices.Speech" version="1.12.1" targetFramework="native" /> | ||
</packages> |
2 changes: 1 addition & 1 deletion
quickstart/cpp/windows/intent-recognition/helloworld/packages.config
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<packages> | ||
- <package id="Microsoft.CognitiveServices.Speech" version="1.12.0" targetFramework="native" /> | ||
+ <package id="Microsoft.CognitiveServices.Speech" version="1.12.1" targetFramework="native" /> | ||
</packages> |
2 changes: 1 addition & 1 deletion
quickstart/cpp/windows/multi-device-conversation/helloworld/packages.config
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<packages> | ||
- <package id="Microsoft.CognitiveServices.Speech" version="1.12.0" targetFramework="native" /> | ||
+ <package id="Microsoft.CognitiveServices.Speech" version="1.12.1" targetFramework="native" /> | ||
</packages> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# Quickstart: Speaker Recognition using wave file in C++ for Windows | ||
|
||
In this quickstart, you'll learn how to do speaker recognition with C++ using the Speech SDK for Windows, including creating a voice profile, enrolling it, and running speaker verification and identification. | ||
See the [accompanying article](https://docs.microsoft.com/azure/cognitive-services/speech-service/quickstart-cpp-windows?pivots=programming-language-cpp) on the SDK documentation page, which describes how to build the corresponding sample from scratch in Visual Studio 2019. | ||
|
||
## Prerequisites | ||
|
||
* A subscription key for the Speech service. See [Try the speech service for free](https://docs.microsoft.com/azure/cognitive-services/speech-service/get-started). | ||
* [Microsoft Visual Studio 2019](https://www.visualstudio.com/), Community Edition or higher. | ||
* The **Desktop development with C++** workload in Visual Studio and the **NuGet package manager** component in Visual Studio. | ||
You can enable both in **Tools** \> **Get Tools and Features**, under the **Workloads** and **Individual components** tabs, respectively. | ||
|
||
## Build the sample | ||
|
||
* **By building this sample you will download the Microsoft Cognitive Services Speech SDK. By downloading you acknowledge its license, see [Speech SDK license agreement](https://aka.ms/csspeech/license201809).** | ||
* [Download the sample code to your development PC.](/README.md#get-the-samples) | ||
* Start Microsoft Visual Studio 2019 and select **File** \> **Open** \> **Project/Solution**. | ||
* Navigate to the folder containing this sample, and select the solution file contained within it. | ||
* Edit the `helloworld.cpp` source: | ||
* Replace the string `YourSubscriptionKey` with your own subscription key. | ||
* Replace the string `YourServiceRegion` with the service region of your subscription. | ||
* Set the active solution configuration and platform to the desired values under **Build** \> **Configuration Manager**: | ||
* On a 64-bit Windows installation, choose `x64` as active solution platform. | ||
* On a 32-bit Windows installation, choose `x86` as active solution platform. | ||
* Press Ctrl+Shift+B, or select **Build** \> **Build Solution**. | ||
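The steps above have you paste the subscription key and region directly into `helloworld.cpp`. One possible alternative, sketched below, is to read both values from environment variables and fall back to the placeholders when they are not set. The variable names `SPEECH_KEY` and `SPEECH_REGION`, the `SpeechSettings` struct, and both helper functions are assumptions made for this illustration and are not part of the sample. Note that MSVC may emit a deprecation warning for `std::getenv`; `_dupenv_s` is the alternative it suggests.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the value of the environment variable `name`, or `fallback` when
// the variable is unset or empty.
std::string settingOrDefault(const char* name, const std::string& fallback)
{
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return (value != nullptr && *value != '\0') ? std::string(value) : fallback;
}

// Hypothetical container for the two values the sample needs.
struct SpeechSettings
{
    std::string key;
    std::string region;
};

// SPEECH_KEY and SPEECH_REGION are assumed names; pick whatever fits your setup.
SpeechSettings loadSpeechSettings()
{
    return SpeechSettings{
        settingOrDefault("SPEECH_KEY", "YourSubscriptionKey"),
        settingOrDefault("SPEECH_REGION", "YourServiceRegion")
    };
}
```

With this in place, the sample's `SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion")` call could take `settings.key` and `settings.region` from `loadSpeechSettings()` instead, which keeps the key out of source control.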
|
||
> **Note** | ||
> If IntelliSense shows red squiggles for Speech SDK APIs, | ||
> right-click in your editor window and select **Rescan** \> **Rescan Solution** to resolve them. | ||
## Run the sample | ||
|
||
To debug the app and then run it, press F5 or use **Debug** \> **Start Debugging**. To run the app without debugging, press Ctrl+F5 or use **Debug** \> **Start Without Debugging**. | ||
|
||
## References | ||
|
||
* [Quickstart article on the SDK documentation site](https://docs.microsoft.com/azure/cognitive-services/speech-service/quickstart-cpp-windows) | ||
* [Speech SDK API reference for C++](https://aka.ms/csspeech/cppref) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
|
||
Microsoft Visual Studio Solution File, Format Version 12.00 | ||
# Visual Studio 15 | ||
VisualStudioVersion = 15.0.27428.2011 | ||
MinimumVisualStudioVersion = 10.0.40219.1 | ||
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "helloworld", "helloworld\helloworld.vcxproj", "{6F0FEB3D-1411-4961-9BE0-CA0591077863}" | ||
EndProject | ||
Global | ||
GlobalSection(SolutionConfigurationPlatforms) = preSolution | ||
Debug|x64 = Debug|x64 | ||
Debug|x86 = Debug|x86 | ||
Release|x64 = Release|x64 | ||
Release|x86 = Release|x86 | ||
EndGlobalSection | ||
GlobalSection(ProjectConfigurationPlatforms) = postSolution | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Debug|x64.ActiveCfg = Debug|x64 | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Debug|x64.Build.0 = Debug|x64 | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Debug|x86.ActiveCfg = Debug|Win32 | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Debug|x86.Build.0 = Debug|Win32 | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Release|x64.ActiveCfg = Release|x64 | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Release|x64.Build.0 = Release|x64 | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Release|x86.ActiveCfg = Release|Win32 | ||
{6F0FEB3D-1411-4961-9BE0-CA0591077863}.Release|x86.Build.0 = Release|Win32 | ||
EndGlobalSection | ||
GlobalSection(SolutionProperties) = preSolution | ||
HideSolutionNode = FALSE | ||
EndGlobalSection | ||
GlobalSection(ExtensibilityGlobals) = postSolution | ||
SolutionGuid = {9D62E0CE-8BF7-49DE-9B17-2FEC5A421F34} | ||
EndGlobalSection | ||
EndGlobal |
Binary file added (+1.6 MB, not shown)
quickstart/cpp/windows/speaker-recognition/helloworld/aboutSpeechSdk.wav
143 changes: 143 additions & 0 deletions
quickstart/cpp/windows/speaker-recognition/helloworld/helloworld.cpp
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,143 @@ | ||
// | ||
// Copyright (c) Microsoft. All rights reserved. | ||
// Licensed under the MIT license. See LICENSE.md file in the project root for full license information. | ||
// | ||
|
||
#include "stdafx.h" | ||
// <code> | ||
#include <iostream> | ||
#include <memory> | ||
#include <string> | ||
#include <vector> | ||
#include <speechapi_cxx.h> | ||
|
||
using namespace std; | ||
using namespace Microsoft::CognitiveServices::Speech; | ||
using namespace Microsoft::CognitiveServices::Speech::Audio; | ||
|
||
void verifySpeaker(const shared_ptr<SpeechConfig>& config, const shared_ptr<VoiceProfile>& profile) | ||
{ | ||
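// Read the test utterance from a wave file and score it against the single enrolled profile. | ||
auto speakerRecognizer = SpeakerRecognizer::FromConfig(config, AudioConfig::FromWavFileInput("myVoiceIsMyPassportVerifyMe04.wav")); | ||
auto model = SpeakerVerificationModel::FromProfile(profile); | ||
// Blocks until the service returns a verification result. | ||
auto result = speakerRecognizer->RecognizeOnceAsync(model).get(); | ||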
if (result->Reason == ResultReason::RecognizedSpeaker) | ||
{ | ||
cout << "Verified voice profile " << result->ProfileId << " score is " << result->GetScore() << endl; | ||
} | ||
else if (result->Reason == ResultReason::Canceled) | ||
{ | ||
auto cancellation = SpeakerRecognitionCancellationDetails::FromResult(result); | ||
cout << "CANCELED " << profile->GetId() << " ErrorCode=" << (int)cancellation->ErrorCode << endl; | ||
cout << "CANCELED " << profile->GetId() << " ErrorDetails=" << cancellation->ErrorDetails << endl; | ||
} | ||
} | ||
|
||
void speakerVerification() | ||
{ | ||
// Creates an instance of a speech config with specified subscription key and service region. | ||
// Replace with your own subscription key and service region (e.g., "westus"). | ||
auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion"); | ||
|
||
// Creates a VoiceProfileClient to enroll your voice profile. | ||
auto client = VoiceProfileClient::FromConfig(config); | ||
|
||
// Creates a text dependent voice profile in one of the supported locales using the client. | ||
auto profile = client->CreateProfileAsync(VoiceProfileType::TextDependentVerification, "en-us").get(); | ||
auto trainingFiles = vector<string>{ "myVoiceIsMyPassportVerifyMe01.wav", "myVoiceIsMyPassportVerifyMe02.wav", "myVoiceIsMyPassportVerifyMe03.wav" }; | ||
for (auto& trainingFile : trainingFiles) | ||
{ | ||
auto audioInput = AudioConfig::FromWavFileInput(trainingFile); | ||
auto result = client->EnrollProfileAsync(profile, audioInput).get(); | ||
if (result->Reason == ResultReason::EnrollingVoiceProfile) | ||
{ | ||
cout << "Enrolling profile id " << profile->GetId() << endl; | ||
} | ||
else if (result->Reason == ResultReason::EnrolledVoiceProfile) | ||
{ | ||
cout << "Enrolled profile id " << profile->GetId() << endl; | ||
verifySpeaker(config, profile); | ||
break; | ||
} | ||
else if (result->Reason == ResultReason::Canceled) | ||
{ | ||
auto cancellation = VoiceProfileEnrollmentCancellationDetails::FromResult(result); | ||
cout << "CANCELED " << profile->GetId() << " ErrorCode=" << (int)cancellation->ErrorCode << endl; | ||
break; | ||
} | ||
cout << "Number of enrollment audios accepted for " << profile->GetId() << " is "<< result->GetEnrollmentInfo(EnrollmentInfoType::EnrollmentsCount) << endl; | ||
cout << "Number of enrollment audios needed to complete " << profile->GetId() << " is " << result->GetEnrollmentInfo(EnrollmentInfoType::RemainingEnrollmentsCount) << endl; | ||
} | ||
|
||
if (!profile->GetId().empty()) | ||
{ | ||
client->DeleteProfileAsync(profile).get(); | ||
} | ||
} | ||
|
||
void identifySpeakers(const shared_ptr<SpeechConfig>& config, const vector<shared_ptr<VoiceProfile>>& profiles) | ||
{ | ||
auto speakerRecognizer = SpeakerRecognizer::FromConfig(config, AudioConfig::FromWavFileInput("wikipediaOcelot.wav")); | ||
auto model = SpeakerIdentificationModel::FromProfiles(profiles); | ||
auto result = speakerRecognizer->RecognizeOnceAsync(model).get(); | ||
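if (result->Reason == ResultReason::RecognizedSpeakers) | ||
{ | ||
cout << "The most similar voice profile is " << result->ProfileId << " with similarity score " << result->GetScore() << endl; | ||
auto raw = result->Properties.GetProperty(PropertyId::SpeechServiceResponse_JsonResult); | ||
cout << "The raw json from the service is " << raw << endl; | ||
} | ||
else if (result->Reason == ResultReason::Canceled) | ||
{ | ||
// Report cancellations the same way verifySpeaker does, so failures are not silent. | ||
auto cancellation = SpeakerRecognitionCancellationDetails::FromResult(result); | ||
cout << "CANCELED ErrorCode=" << (int)cancellation->ErrorCode << " ErrorDetails=" << cancellation->ErrorDetails << endl; | ||
} | ||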
} | ||
|
||
void speakerIdentification() | ||
{ | ||
// Creates an instance of a speech config with specified subscription key and service region. | ||
// Replace with your own subscription key and service region (e.g., "westus"). | ||
auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion"); | ||
|
||
// Creates a VoiceProfileClient to enroll your voice profile. | ||
auto client = VoiceProfileClient::FromConfig(config); | ||
|
||
// Creates two text independent voice profiles in one of the supported locales. | ||
auto profile1 = client->CreateProfileAsync(VoiceProfileType::TextIndependentIdentification, "en-us").get(); | ||
auto profile2 = client->CreateProfileAsync(VoiceProfileType::TextIndependentIdentification, "en-us").get(); | ||
cout << "Created profiles " << profile1->GetId() << " and " << profile2->GetId() << " for text independent identification." << endl; | ||
|
||
// Enroll the two profiles | ||
auto result1 = client->EnrollProfileAsync(profile1, AudioConfig::FromWavFileInput("aboutSpeechSdk.wav")).get(); | ||
cout << "Submitted enrollment audio for profile " << profile1->GetId() << endl; | ||
auto result2 = client->EnrollProfileAsync(profile2, AudioConfig::FromWavFileInput("speechService.wav")).get(); | ||
cout << "Submitted enrollment audio for profile " << profile2->GetId() << endl; | ||
|
||
// Identify the two profiles after successful enrollments. | ||
if (result1->Reason == ResultReason::EnrolledVoiceProfile && result2->Reason == ResultReason::EnrolledVoiceProfile) | ||
{ | ||
vector<shared_ptr<VoiceProfile>> profiles{ profile1, profile2 }; | ||
identifySpeakers(config, profiles); | ||
} | ||
|
||
// Delete the two profiles once we are done. | ||
if (!profile1->GetId().empty()) | ||
{ | ||
client->DeleteProfileAsync(profile1).get(); | ||
} | ||
if (!profile2->GetId().empty()) | ||
{ | ||
client->DeleteProfileAsync(profile2).get(); | ||
} | ||
} | ||
|
||
int main() | ||
{ | ||
try | ||
{ | ||
cout << "Speaker Verification:" << endl; | ||
speakerVerification(); | ||
|
||
cout << "\nSpeaker Identification:" << endl; | ||
speakerIdentification(); | ||
} | ||
catch (const exception& e) | ||
{ | ||
cout << e.what() << endl; | ||
} | ||
cout << "Please press a key to continue.\n"; | ||
cin.get(); | ||
return 0; | ||
} | ||
// </code> |
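The enrollment loop in `speakerVerification` keeps sending training files while the service reports `ResultReason::EnrollingVoiceProfile`, and stops on `EnrolledVoiceProfile` or `Canceled`. The sketch below models only that control flow, without the Speech SDK, so it can be compiled and stepped through on its own. `EnrollReason`, `EnrollOutcome`, `enrollUntilDone`, and `fakeService` are hypothetical stand-ins invented for this illustration; none of them are SDK types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the three ResultReason values the sample checks.
enum class EnrollReason { Enrolling, Enrolled, Canceled };

struct EnrollOutcome
{
    EnrollReason lastReason; // Reason reported for the last file that was sent.
    std::size_t filesUsed;   // Number of training files actually sent.
};

using EnrollOnce = std::function<EnrollReason(const std::string&)>;

// Sends training files one by one until the profile is enrolled, the
// operation is canceled, or the files run out.
EnrollOutcome enrollUntilDone(const std::vector<std::string>& files, const EnrollOnce& enrollOnce)
{
    assert(static_cast<bool>(enrollOnce));
    EnrollOutcome outcome{ EnrollReason::Enrolling, 0 };
    for (const auto& file : files)
    {
        outcome.lastReason = enrollOnce(file);
        ++outcome.filesUsed;
        if (outcome.lastReason != EnrollReason::Enrolling)
        {
            break; // Enrolled or Canceled: no further audio is needed.
        }
    }
    return outcome;
}

// Fake service for experimentation: reports Enrolled once `needed` files
// have been accepted, and Enrolling before that.
EnrollOnce fakeService(std::size_t needed)
{
    auto accepted = std::make_shared<std::size_t>(0);
    return [accepted, needed](const std::string&) {
        ++*accepted;
        return *accepted >= needed ? EnrollReason::Enrolled : EnrollReason::Enrolling;
    };
}
```

One thing the model makes visible: if the files run out while the last reason is still `Enrolling`, the loop ends without ever reaching the enrolled state. In the sample that means `verifySpeaker` is never called, so a caller may want to check the final reason after the loop.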