I took an app built with MRTK to a very noisy factory environment and it took more than 5 minutes to open after the splash screen was displayed. I found that apps like '3D Viewer', built (AFAIK) without the MRTK, didn't seem to suffer from this problem.
It should be noted that the app uses voice commands.
The apps would open in a reasonable time.
The apps don't open in a reasonable time, e.g. more than 5 minutes after the splash screen.
Take an app built with the toolkit using voice commands and run it up in an extremely noisy environment. You should find that it hangs for minutes after the splash screen & if you stop it in the debugger you'll find that it is stuck somewhere spinning around trying to deal with speech. I haven't yet diagnosed whether this is caused by MRTK code or by UWP code or some combination.
2018.3.2f1
Not 100% sure - think it's the latest release at the time of writing - 2017.4.3.0.
I had a similar issue when I was doing a demo in a busy place (a conference). It was an example scene with HTK.
You deserve a medal for figuring out the repro for this bug. :D
I started to see this issue after April update #2085
Observed the same issue on HoloLens when the microphone is enabled.
HoloLens OS - latest as of 2/27
Unity : 2018.3.6f1
Holotoolkit 2017.4.30
I have the same exact issue. The exact line that is causing this for me is:
"keywordRecognizer = new KeywordRecognizer(keywords.Keys.ToArray());"
It just gets stuck there if there's noise (managed to 100% reproduce this by playing music while starting the app).
EDIT: the worst part is that it blocks execution of the app, stopping the rendering (can't even show anything on screen)
EDIT2: this happened using 2017 LTS, 2018.3.6f1 and 2018.3.8f1
Does this reproduce using v2 of MRTK on Unity 2018.3 or newer?
@davidkline-ms - I'm 99% sure the answer is yes, based on the details provided in issue #4487.
Yes it does because it is not a bug caused by MRTK. More details:
We should make sure the MRTK dictation system has the workaround until Unity or the platform fixes the issue.
copy/paste from the other bug
Ported to github, reference 21521102
This is unconfirmed, but seems to be highly likely based on other recent observations, so we wanted to get it on the radar. The problem is that Windows.Media.SpeechRecognition.SpeechRecognizer can block for up to 20 seconds while processing audio input, and this can happen during instantiation. SpeechRecognizer is wrapped by Unity's KeywordRecognizer, which is initialized on the Unity main thread, and in turn instantiates SpeechRecognizer on the Unity main thread. If you launch an app that uses Unity's KeywordRecognizer in a noisy environment, the app "hangs" on SpeechRecognizer, and PLM will kill the app as it appears to be unresponsive.
Ideally, this would be fixed in SpeechRecognizer, but this work isn't currently planned.
The next best place to have this fixed is in Unity's wrapper (KeywordRecognizer), but as far as I can tell Unity does not intend to implement a workaround for the underlying Windows API bug. We could try to push on this more, but I don't think it is currently being considered.
In the meantime, it would probably make sense to work around the problem in MRTK's WindowsSpeechInputProvider. The recommendation from Unity is to asynchronously instantiate the SpeechRecognizer explicitly on a background thread, and delay instantiation of KeywordRecognizer until instantiation is complete (see https://forum.unity.com/threads/hololens-app-gets-stuck-during-keywordrecognizer-constructor-when-in-a-noisy-environment.648853/). Presumably this works because there is some global/singleton underlying state for SpeechRecognizer that all instances use, not sure.
I think in MRTK this would mean speech input might not work in a noisy environment until 20+ seconds after launching the app, but this might be the best we can do if the underlying Windows bug is not addressed.
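A minimal sketch of the workaround described above, assuming the blocking cost is paid by the first SpeechRecognizer instantiation and that Unity resumes the awaited continuation on the main thread. SpeechWarmup, DeferredKeywordRecognizer and the placeholder keywords are hypothetical names for illustration, not MRTK or Unity APIs. The Unity-specific part is guarded by UNITY_WSA and the WinRT call by ENABLE_WINMD_SUPPORT, so the helper also compiles as plain C#.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not part of MRTK or Unity): run a potentially blocking
// warm-up step on a thread-pool thread, then create the real object afterwards.
public static class SpeechWarmup
{
    public static async Task<T> CreateAfterWarmupAsync<T>(Action blockingWarmup, Func<T> create)
    {
        // May block for many seconds in a noisy room, but not on the calling thread.
        await Task.Run(blockingWarmup);
        // Resumes on the captured SynchronizationContext if one exists
        // (in Unity, that is the main thread when called from Start).
        return create();
    }
}

#if UNITY_WSA
// Illustrative component; the keywords are placeholders.
public class DeferredKeywordRecognizer : UnityEngine.MonoBehaviour
{
    private UnityEngine.Windows.Speech.KeywordRecognizer keywordRecognizer;
    private readonly string[] keywords = { "select", "menu" };
    private bool destroyed;

    private async void Start()
    {
        keywordRecognizer = await SpeechWarmup.CreateAfterWarmupAsync(
            WarmUpPlatformRecognizer,
            // Skip creation if the object was destroyed while waiting.
            () => destroyed ? null : new UnityEngine.Windows.Speech.KeywordRecognizer(keywords));

        if (keywordRecognizer == null)
        {
            return;
        }

        keywordRecognizer.OnPhraseRecognized += args => UnityEngine.Debug.Log(args.text);
        keywordRecognizer.Start();
    }

    private static void WarmUpPlatformRecognizer()
    {
#if ENABLE_WINMD_SUPPORT
        // Assumption: instantiating and disposing one SpeechRecognizer here absorbs
        // the slow initialization that would otherwise stall the main thread.
        using (new Windows.Media.SpeechRecognition.SpeechRecognizer()) { }
#endif
    }

    private void OnDestroy()
    {
        destroyed = true;
        keywordRecognizer?.Dispose();
    }
}
#endif
```

With this shape the app keeps rendering while the warm-up runs, at the cost already noted: voice commands are unavailable until the background step completes.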
Something that I noticed while looking more deeply into speech-related bugs: Unity's API exposes the DictationRecognizer/KeywordRecognizer methods as synchronous methods (Start, Stop, etc.). However, when we build this for UWP, behind the scenes the SAPI APIs trigger a couple of async/await calls. The Unity API returns control to the main thread to prevent UI blocking (behind the scenes SAPI could take up to 20s to initialize), but the platform is still in the middle of initializing/starting/stopping the dictation recognizer. What I see as a fundamental issue here is the disconnect where the Unity main thread is tricked into believing that the calls it made have successfully completed, even though they have not. This results in a race condition where the main thread might start using the speech APIs without speech being initialized. I believe a similar issue exists for the speech recognizer.
A potential fix might be for the dictation recognizer and keyword recognizer to raise events signaling that the system is fully initialized/started/stopped. If we put a workaround in MRTK, that would make apps agnostic to this bug, which is great; however, we might end up masking this fundamental issue.
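To illustrate the kind of completion signal suggested above, here is a rough sketch of a readiness gate that a wrapper could expose. RecognizerReadiness and its members are hypothetical; neither Unity nor MRTK is known to provide this type, and the sketch only shows the event/await pattern, not how the platform would report that initialization has finished.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical type: callers wait for or subscribe to a "fully initialized"
// signal instead of assuming a synchronous Start() call has completed.
public sealed class RecognizerReadiness
{
    private readonly TaskCompletionSource<bool> ready =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    // Raised once, when the underlying speech system reports it is initialized.
    public event Action Ready;

    public bool IsReady => ready.Task.IsCompleted;

    // Awaitable form of the same signal.
    public Task WhenReady => ready.Task;

    // Called by whatever code learns that initialization has actually finished.
    public void SignalReady()
    {
        // TrySetResult returns true only for the first call, so Ready fires once.
        if (ready.TrySetResult(true))
        {
            Ready?.Invoke();
        }
    }
}
```

Code that depends on speech would then await WhenReady (or check IsReady) before issuing further recognizer calls, which avoids the race described above.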