Azure-docs: Cannot retrieve trained VoiceProfile from Azure Cognitive Services

Created on 21 Sep 2020 · 8 comments · Source: MicrosoftDocs/azure-docs

I do not see anywhere in the documentation how we can reuse our trained, enrolled identification profiles for later identification. Even if I save the GUIDs of the voice profiles once they are created, I cannot retrieve the voice profiles from the Azure service in order to verify a user's identity. The examples provided are no good for the real world: people can't repeat voice training every single time they start an application. How can I verify an enrolled user after voice training without having to create a brand-new profile every time?
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-speaker-recognition

public async Task VerificationEnroll(SpeechConfig config, Dictionary<string, string> profileMapping)
{
    var client = new VoiceProfileClient(config);

    _voiceProfile = await client.CreateProfileAsync(VoiceProfileType.TextIndependentVerification, "en-us");

    var audioInput = AudioConfig.FromDefaultMicrophoneInput();

    CurrentInstructions.Text = $"Enrolling profile id {_voiceProfile.Id}.";

    // Give the profile a human-readable display name.
    profileMapping.Add(_voiceProfile.Id, "Person's Name");

    VoiceProfileEnrollmentResult result = null;
    while (result is null || result.RemainingEnrollmentsSpeechLength > TimeSpan.Zero)
    {
        CurrentInstructions.Text = "Continue speaking to add to the profile enrollment sample.";
        result = await client.EnrollProfileAsync(_voiceProfile, audioInput);
        CurrentInstructions.Text = $"Remaining enrollment audio time needed: {result.RemainingEnrollmentsSpeechLength}";
    }

    if (result.Reason == ResultReason.EnrolledVoiceProfile)
    {
        await SpeakerVerify(config, _voiceProfile, profileMapping);
    }
    else if (result.Reason == ResultReason.Canceled)
    {
        var cancellation = VoiceProfileEnrollmentCancellationDetails.FromResult(result);
        CurrentInstructions.Text = $"CANCELED {_voiceProfile.Id}: ErrorCode={cancellation.ErrorCode} ErrorDetails={cancellation.ErrorDetails}";
    }
}
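For context, what this thread is asking for is the ability to skip CreateProfileAsync on later runs. A minimal sketch of that workaround, assuming a previously saved GUID (savedProfileId is a hypothetical name) and that the VoiceProfile(string) constructor used in the REST snippet further down this thread accepts a stored profile id:

    // Hedged sketch: verify a previously enrolled speaker without re-enrolling.
    // Assumes `savedProfileId` is a profile GUID persisted after a past enrollment.
    public async Task VerifyReturningUser(SpeechConfig config, string savedProfileId)
    {
        // Rebuild the profile handle from the stored id instead of calling
        // CreateProfileAsync (which would create a brand-new, untrained profile).
        var profile = new VoiceProfile(savedProfileId);
        var model = SpeakerVerificationModel.FromProfile(profile);

        using (var audioInput = AudioConfig.FromDefaultMicrophoneInput())
        {
            var recognizer = new SpeakerRecognizer(config, audioInput);
            var result = await recognizer.RecognizeOnceAsync(model);
            Console.WriteLine($"Profile {result.ProfileId} verified with score {result.Score}");
        }
    }

This only works if the id still exists on the service side, which is why listing profiles (discussed below) matters for keeping a local database in sync.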

public async Task SpeakerVerify(SpeechConfig config, VoiceProfile profile, Dictionary<string, string> profileMapping)
{
    try
    {
        var speakerRecognizer = new SpeakerRecognizer(config, AudioConfig.FromDefaultMicrophoneInput());
        var model = SpeakerVerificationModel.FromProfile(profile);

        CurrentInstructions.Text = "Speak the passphrase to verify: \"My voice is my passport, please verify me.\"";
        var result = await speakerRecognizer.RecognizeOnceAsync(model);
        CurrentInstructions.Text = $"Verified voice profile for speaker {profileMapping[result.ProfileId]}, score is {result.Score}";
    }
    catch (Exception ex)
    {
        // Surface the failure rather than silently swallowing it.
        CurrentInstructions.Text = $"Verification failed: {ex.Message}";
    }
}


Labels: Pri2, cognitive-services/svc, cxp, product-question, speech-service/subsvc, triaged

All 8 comments

Thanks for the feedback! We are currently investigating and will update you shortly.

Thanks very much! We are currently looking to replace our existing speaker recognition software and would like to use Microsoft Cognitive Services for this, so we are eagerly looking forward to learning how. This is the example we started with to enroll and then verify. We are saving the profile GUIDs locally but can't retrieve the voice profile to verify without having to voice-train every single time. https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues

One would imagine the VoiceProfileClient class would have a Read function, given that it appears to be a CRUD wrapper. Oddly, 'read' is replaced with Reset, whatever that does...

Having everyone re-enroll every time renders this useless, as @roninstar stated, but that's such a glaring omission of functionality that I can't imagine it's the case (unless it's some annoying PII/privacy constraint)! I'm going to guess you've placed it somewhere else and just not documented it well enough for us to find.

If it is not possible to retrieve the profiles from Azure (which isn't good, as I cannot verify that my database mapping of voice profile IDs to users matches what is in Azure), then I'd suggest the SpeakerIdentificationModel.FromProfiles() function have a parameterless overload as well. In that case Azure would internally match the audio against ALL of its profiles (wildcard) when RecognizeOnceAsync is called. Honestly, that is my use case and preferred way of calling it anyway; I don't see myself filtering down the list of profiles when I try to identify a user (though I still want a way to list all the profiles saved in Azure for other reasons).

I'm also blocked on speaker identification until this is resolved.

@roninstar @jkatebin Thanks for the information and explanation. I have forwarded this to the product team for more investigation. Sorry for the delay.

@jkatebin @roninstar Thank you both for the feedback. I have confirmed that this is a limitation in the SDK, but the RESTful API does support this function. We are working on fixing this in the SDK. Thanks for the feedback again.

We will now proceed to close this thread. If there are further questions regarding this matter, please respond here and @YutongTie-MSFT and we will gladly continue the discussion.

Could you provide us with a link to the REST API that we can use for this function? Thanks for following up and letting us know it will be added to the SDK in the future. Best regards,
Mark

@roninstar check out https://docs.microsoft.com/en-us/rest/api/speakerrecognition/identification/textindependent/listprofiles

Below is a quick snippet I threw together in C#, if it helps. It needs a lot of work before being production-ready, of course, but like you I plan to use the SDK, so this was just for testing until it gets wrapped in by Microsoft.

public async Task<IEnumerable<VoiceProfile>> GetUserVoiceProfilesFromAzure()
{
    // This will probably be added to the SDK; they recognize it's missing.
    // Use the SDK once it's included.
    var endpoint = $"{_speechSettings.Endpoint}/speaker/identification/v2.0/text-independent/profiles";

    using (var requestMessage = new HttpRequestMessage(HttpMethod.Get, endpoint))
    {
        requestMessage.Headers.Add("Ocp-Apim-Subscription-Key", _speechSettings.Key);

        var response = await _httpClient.SendAsync(requestMessage);
        if (response.IsSuccessStatusCode)
        {
            // JObject comes from Newtonsoft.Json (Newtonsoft.Json.Linq).
            var result = await response.Content.ReadAsStringAsync();
            return JObject.Parse(result)["profiles"]
                .Select(p => new VoiceProfile(p["profileId"].Value<string>()))
                .AsEnumerable();
        }
    }

    return Enumerable.Empty<VoiceProfile>();
}

Thanks very much! This makes sense.
