MixedRealityToolkit-Unity: App Stalls/Delays at Startup Due to Noisy Environment

Created on 21 Feb 2019 · 11 Comments · Source: microsoft/MixedRealityToolkit-Unity

Overview

I took an app built with MRTK to a very noisy factory environment and it took more than 5 minutes to open after the splash screen was displayed. I found that apps like '3D Viewer', built (AFAIK) without the MRTK, didn't seem to suffer from this problem.

It should be noted that the app uses voice commands.

Expected Behavior

The apps would open in a reasonable time.

Actual Behavior

The apps don't open in a reasonable time, e.g. more than 5 minutes from the splash screen.

Steps to reproduce

Take an app built with the toolkit using voice commands and run it up in an extremely noisy environment. You should find that it hangs for minutes after the splash screen & if you stop it in the debugger you'll find that it is stuck somewhere spinning around trying to deal with speech. I haven't yet diagnosed whether this is caused by MRTK code or by UWP code or some combination.

Unity Editor Version

2018.3.2f1

Mixed Reality Toolkit Release Version

Not 100% sure - think it's the latest release at the time of writing - 2017.4.3.0.

Bug D365 External Speech

Source

mtaulty

👍3

All 11 comments

Had a similar issue when I was doing a demo in a busy place (a conference). It was an example scene with HTK.

cre8ivepark on 21 Feb 2019

You deserve a medal for figuring out the repro for this bug. :D

Ecnassianer on 21 Feb 2019

😄1

I started to see this issue after April update #2085

ivan2007 on 24 Feb 2019

Observed the same issue on HoloLens when the microphone is enabled.
HoloLens OS: latest as of 2/27
Unity: 2018.3.6f1
HoloToolkit 2017.4.30

payyavula on 28 Feb 2019

I have the same exact issue. The exact line that is causing this for me is:
"keywordRecognizer = new KeywordRecognizer(keywords.Keys.ToArray());"

It just gets stuck there if there's noise (managed to 100% reproduce this by playing music while starting the app).

EDIT: the worst part is that it blocks execution of the app, stopping the rendering (can't even show anything on screen)

EDIT2: this happened using 2017 LTS, 2018.3.6f1 and 2018.3.8f1

edgarsantospt on 14 Mar 2019

👍2 ❤1

Does this reproduce using v2 of MRTK on Unity 2018.3 or newer?

davidkline-ms on 30 May 2019

@davidkline-ms - I'm 99% sure the answer is yes, based on the details provided in issue #4487.

ryantrem on 30 May 2019

👍1

Yes it does because it is not a bug caused by MRTK. More details:

edgarsantospt on 31 May 2019

👍1

We should make sure the MRTK dictation system has the workaround until Unity or the platform fixes the issue.

Yoyozilla on 4 Jun 2019

Copy/paste from the other bug.
Ported to GitHub, reference 21521102

This is unconfirmed, but seems to be highly likely based on other recent observations, so we wanted to get it on the radar. The problem is that Windows.Media.SpeechRecognition.SpeechRecognizer can block for up to 20 seconds while processing audio input, and this can happen during instantiation. SpeechRecognizer is wrapped by Unity's KeywordRecognizer, which is initialized on the Unity main thread, and in turn instantiates SpeechRecognizer on the Unity main thread. If you launch an app that uses Unity's KeywordRecognizer in a noisy environment, the app "hangs" on SpeechRecognizer, and PLM will kill the app as it appears to be unresponsive.

Ideally, this would be fixed in SpeechRecognizer, but this work isn't currently planned.

The next best place to have this fixed is in Unity's wrapper (KeywordRecognizer), but as far as I can tell Unity does not intend to implement a workaround for the underlying Windows API bug. We could try to push on this more, but I don't think it is currently being considered.

In the meantime, it would probably make sense to work around the problem in MRTK's WindowsSpeechInputProvider. The recommendation from Unity is to asynchronously instantiate the SpeechRecognizer explicitly on a background thread, and delay instantiation of KeywordRecognizer until instantiation is complete (see https://forum.unity.com/threads/hololens-app-gets-stuck-during-keywordrecognizer-constructor-when-in-a-noisy-environment.648853/). Presumably this works because there is some global/singleton underlying state for SpeechRecognizer that all instances use, but I'm not sure.

I think in MRTK this would mean speech input might not work in a noisy environment until 20+ seconds after launching the app, but this might be the best we can do if the underlying Windows bug is not addressed.
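
A minimal sketch of the pattern described above, written as plain C# so that it runs outside Unity. Everything named here is hypothetical: SlowRecognizer is a stand-in for the platform SpeechRecognizer whose constructor can block, and DeferredSpeechSetup is an illustrative helper, not MRTK or Unity code. In a real project the warm-up would construct the actual SpeechRecognizer on the background thread, and the callback passed to Tick would create the KeywordRecognizer on the main thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a recognizer whose constructor can block
// for a long time (the thread reports up to ~20 s in a noisy room).
public sealed class SlowRecognizer : IDisposable
{
    public SlowRecognizer(int simulatedDelayMs)
    {
        Thread.Sleep(simulatedDelayMs); // simulates the blocking constructor
    }

    public void Dispose() { }
}

// Illustrative helper: warms the recognizer up off the main thread and
// defers keyword-recognizer creation until the warm-up has finished.
public sealed class DeferredSpeechSetup
{
    private Task warmUp;

    public bool KeywordsReady { get; private set; }
    public bool Failed { get; private set; }

    // Call once at startup. Returns immediately; the blocking
    // constructor runs on a thread-pool thread instead.
    public void Begin(int simulatedDelayMs)
    {
        warmUp = Task.Run(() =>
        {
            using (new SlowRecognizer(simulatedDelayMs)) { }
        });
    }

    // Call once per frame from the main thread (e.g. from Update).
    // Invokes the callback exactly once, after the warm-up completes.
    public void Tick(Action createKeywordRecognizer)
    {
        if (KeywordsReady || Failed || warmUp == null || !warmUp.IsCompleted)
        {
            return; // nothing to do yet; the caller keeps rendering
        }

        if (warmUp.IsFaulted)
        {
            Failed = true; // warm-up threw; leave speech disabled
            return;
        }

        createKeywordRecognizer();
        KeywordsReady = true;
    }
}
```

With this shape the app keeps rendering while the warm-up runs, at the cost noted above: voice commands stay unavailable until Tick observes that the background task has completed.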

Yoyozilla on 4 Jun 2019

Something that I noticed while looking more deeply into speech-related bugs: Unity's API exposes the DictationRecognizer/KeywordRecognizer methods (Start, Stop, etc.) as synchronous methods. However, when we build for UWP, the SAPI APIs trigger a couple of async/await calls behind the scenes. The Unity API returns control to the main thread to prevent UI blocking (behind the scenes, SAPI could take up to 20s to initialize), but the platform is still in the middle of initializing/starting/stopping the dictation recognizer. What I see as the fundamental issue here is the disconnect: the Unity main thread is tricked into believing that the calls it made have completed successfully, even though they have not. This results in a race condition where the main thread might start using the speech APIs before speech has been initialized. I believe a similar issue exists for the speech recognizer.

A potential fix might be for the dictation recognizer and keyword recognizer to raise events signaling that the system is fully initialized/started/stopped. Putting a workaround in MRTK would make apps agnostic to this bug, which is great; however, we might end up masking this fundamental issue.
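
To make the event-based idea concrete, here is a rough sketch in plain C#. The RecognizerLifecycle class, its Initialized event and the platformInit delegate are all hypothetical names used for illustration; neither Unity's DictationRecognizer nor KeywordRecognizer exposes such an event, per the comment above. The sketch only shows how a completion signal would let callers avoid the race.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical wrapper that reports when asynchronous platform
// initialization has really finished, instead of returning early
// and leaving callers to guess.
public sealed class RecognizerLifecycle
{
    // Raised once, after the platform work has completed.
    public event Action Initialized;

    public bool IsInitialized { get; private set; }

    // platformInit is a stand-in for the async platform calls that
    // run behind a synchronous-looking Start().
    public async Task InitializeAsync(Func<Task> platformInit)
    {
        await platformInit();
        IsInitialized = true;

        // Note: this fires on whichever thread the continuation runs on.
        // A Unity integration would need to marshal it to the main thread.
        Initialized?.Invoke();
    }

    // Refuses to run before initialization completes, which makes the
    // race described above an explicit error rather than silent misuse.
    public void StartListening()
    {
        if (!IsInitialized)
        {
            throw new InvalidOperationException(
                "Recognizer is not initialized yet; wait for Initialized.");
        }
        // The real start call would go here.
    }
}
```

A caller would subscribe to Initialized (or await InitializeAsync) and only then call StartListening, so the main thread never uses the speech APIs while the platform is still initializing.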