Azure-docs: Using custom audio input other than default microphone.

Created on 10 Oct 2018 · 11 Comments · Source: MicrosoftDocs/azure-docs

Hey there!

I've been struggling for an afternoon to allow the user to select the audio input to use with AudioConfig (https://docs.microsoft.com/en-us/javascript/api/microsoft-cognitiveservices-speech-sdk/audioconfig?view=azure-node-latest), but I just can't figure it out. I read through the SDK code as well as the docs to work out what's expected and how to do it (I assume AudioConfig.fromStreamInput is the right way), but to no avail.

I tried using the getUserMedia Web API to obtain a stream (from a specific audio input), and then passing that stream to AudioConfig.fromStreamInput, but that didn't work.

I also tried using createMediaStreamSource (of the Web Audio API) to convert the stream from getUserMedia to a MediaStreamAudioSourceNode, but that didn't work either.

How do I get a stream from another source in the format "AudioInputStream" like the documentation requests? Or, am I going about this completely wrong?

Thanks!



All 11 comments

Thanks for the feedback! We are currently investigating and will update you shortly.

Hi @jpetitte,

please use

  static function fromDefaultMicrophoneInput()

to create an audio configuration that uses the default microphone of the user's browser.
This requires the user to approve use of the microphone the first time this function is called (if you serve the page over https) or each time the function is called (if you serve it over http or from a local file).

If the user declines to accept, an error should be raised so your code can handle the decline of the audio permission.

The other two functions are useful in scenarios where the user wants to upload an existing audio file. In this case, the file must be recorded at 16000 samples/sec, 16 bits/sample, 1 channel (mono). Any other format will currently raise an error.

hope that helps.

kind regards,
friedel
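The format requirement mentioned above (16000 samples/sec, 16 bits/sample, mono) can be checked before a file is handed to the SDK. The following is only a minimal sketch, not part of the SDK: `readWavFormat` and `isSupportedFormat` are hypothetical helper names, and the code assumes a canonical RIFF/WAVE header in which the "fmt " chunk directly follows the 12-byte RIFF header, which is not true of every WAV file.

```typescript
// Audio properties read from a WAV header.
interface WavFormat {
  sampleRate: number;
  bitsPerSample: number;
  channels: number;
}

// Reads four ASCII characters starting at the given byte offset.
function readTag(view: DataView, offset: number): string {
  let tag = "";
  for (let i = 0; i < 4; i++) {
    tag += String.fromCharCode(view.getUint8(offset + i));
  }
  return tag;
}

// Hypothetical helper: parses the "fmt " chunk of a canonical WAV header.
// Assumption: the "fmt " chunk starts at byte 12 (canonical 44-byte header).
function readWavFormat(header: ArrayBuffer): WavFormat {
  if (header.byteLength < 36) {
    throw new Error("header too short to contain a fmt chunk");
  }
  const view = new DataView(header);
  if (
    readTag(view, 0) !== "RIFF" ||
    readTag(view, 8) !== "WAVE" ||
    readTag(view, 12) !== "fmt "
  ) {
    throw new Error("not a canonical RIFF/WAVE header");
  }
  // All multi-byte fields in a RIFF file are little-endian.
  return {
    channels: view.getUint16(22, true),
    sampleRate: view.getUint32(24, true),
    bitsPerSample: view.getUint16(34, true),
  };
}

// Hypothetical helper: true only for the format named in this thread.
function isSupportedFormat(format: WavFormat): boolean {
  return (
    format.sampleRate === 16000 &&
    format.bitsPerSample === 16 &&
    format.channels === 1
  );
}
```

In a browser, the header bytes could come from something like `file.slice(0, 44).arrayBuffer()` on the uploaded File, so a mismatched recording can be rejected with a clear message instead of relying on the SDK error.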

Thanks for the reply @fmegen!

Just to clarify to make sure I understand correctly: there is no way to select the audio input source? It will always have to be the default microphone?

Hi @jpetitte,

To be clear, there are TWO ways to specify the audio source.

  1. Use AudioConfig.fromDefaultMicrophoneInput(). This will access your microphone.

  2. Use AudioConfig.fromWavFileInput(). This will use the File() you opened (in a browser, this is usually some upload).

A third option is AudioConfig.fromStreamInput(), for custom streams.

In any case, the only supported configuration is 16000 samples/sec, 16 bits/sample, 1 channel (mono).
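For the custom-stream route, audio captured in a browser typically arrives as 32-bit float samples at 44100 or 48000 samples/sec, so it would need converting to the format above first. The sketch below shows one way that conversion might look. `downsample` and `floatTo16BitPcm` are hypothetical names, the input is assumed to be already mono, and the averaging downsampler is a deliberately naive stand-in for a proper low-pass filter.

```typescript
const TARGET_RATE = 16000;

// Hypothetical helper: reduces the sample rate by averaging each group of
// input samples that maps onto one output sample. Assumes mono input.
function downsample(
  input: Float32Array,
  inputRate: number,
  targetRate: number = TARGET_RATE
): Float32Array {
  if (inputRate < targetRate) {
    throw new Error("upsampling is not handled in this sketch");
  }
  if (inputRate === targetRate) {
    return input;
  }
  const ratio = inputRate / targetRate;
  const outLength = Math.floor(input.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.floor((i + 1) * ratio);
    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += input[j];
    }
    out[i] = sum / (end - start);
  }
  return out;
}

// Hypothetical helper: converts float samples in [-1, 1] to little-endian
// signed 16-bit PCM. Out-of-range samples are clamped.
function floatTo16BitPcm(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    const scaled = s < 0 ? s * 0x8000 : s * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true);
  }
  return buffer;
}
```

Each converted chunk, for example `floatTo16BitPcm(downsample(chunk, 48000))`, could then be written to whatever stream object is passed to AudioConfig.fromStreamInput(). How that stream is created depends on the SDK version, so check the AudioInputStream reference rather than relying on this sketch.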

@fmegen,

I see. That's what I was afraid of. Is there a GitHub repo for this SDK (I found one for the old SDK)?

After reading through the webpack bundle for a few hours, I think I could make a method that allows the user to select the audio input source and process it to the expected format. I think I can do this just by copying the fromDefaultMicrophoneInput method (renaming it to a new method), adding an option to pass in the ID of the audio source (as defined by enumerateDevices), then using that selection when calling getUserMedia, such as at line 3642 in the bundle.

I could do this with the webpack bundle, but after something has been webpacked, it's much tougher.

This is functionality that I will need (I have no choice unfortunately), so I'll have to build something one way or another. It would be cool if I can add it so everyone can use it.
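The device-selection idea described above (take an ID from enumerateDevices and use it when calling getUserMedia) might be sketched as follows. This is not the SDK's code: `listAudioInputs` and `constraintsForDevice` are hypothetical helpers, and the `DeviceInfo` and `AudioConstraints` interfaces are local stand-ins for the browser's own types so the sketch compiles without DOM typings.

```typescript
// Local stand-in for the fields of the browser's MediaDeviceInfo used here.
interface DeviceInfo {
  deviceId: string;
  kind: string;
  label: string;
}

// Local stand-in for the constraints object accepted by getUserMedia.
interface AudioConstraints {
  audio: { deviceId: { exact: string } };
  video: false;
}

// Hypothetical helper: keeps only microphones and other audio inputs.
function listAudioInputs(devices: DeviceInfo[]): DeviceInfo[] {
  return devices.filter((d) => d.kind === "audioinput");
}

// Hypothetical helper: builds constraints that request one specific input.
// "exact" makes the request fail rather than fall back to another device.
function constraintsForDevice(
  devices: DeviceInfo[],
  deviceId: string
): AudioConstraints {
  const match = listAudioInputs(devices).find((d) => d.deviceId === deviceId);
  if (!match) {
    throw new Error("no audio input with id " + deviceId);
  }
  return { audio: { deviceId: { exact: deviceId } }, video: false };
}
```

In a browser, the device list would come from `navigator.mediaDevices.enumerateDevices()`, and the returned object would be passed to `navigator.mediaDevices.getUserMedia()`. Device labels are usually empty until the user has granted microphone permission at least once.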

Hi @jpetitte,
thanks a lot for your offer.

(@wolfma61) There are plans to open-source the library, but currently no date has been decided.

@jpetitte Thanks a lot for your input. We will now proceed to close this thread. If there are further questions regarding this matter, please respond here and tag @YutongTie-MSFT, and we will gladly continue the discussion.

@jpetitte Thanks for your response.

@YutongTie-MSFT

I went ahead and modified the current browser SDK webpack bundle to use a custom input. I'm sure you guys already know how to do it, but if you're interested in seeing the changes I made, let me know. I don't mind having a hack to make things work for us for now, but having something officially supported will be good long term.

@jpetitte I know it's late. I have been trying to use a MediaStream object with AudioConfig for 2 days. I finally found this thread. Can you please help me out? I am not very familiar with the codebase.

@mitanshu001 no worries. I should be able to get to this tomorrow.
