Runtime: Bring System.Speech to .NET Core or add an alternative

Created on 27 Sep 2019 · 21 comments · Source: dotnet/runtime

The System.Speech API is not available in .NET Core or .NET Standard, and there is currently no alternative for synthesizing speech locally on these frameworks. Please bring an API for this.

I am trying to migrate a project from .NET Framework to .NET Core, but this is blocking the migration.

area-Meta

Most helpful comment

@terrajobst Does .NET Core have a viable alternative for local speech synthesis?

All 21 comments

@preethikurup @terrajobst have we already reached out to owners of System.Speech? I don't recall working with that team.

Yes, they are now part of Azure and there is a new offering. System.Speech is no longer being evolved.

We don't plan on bringing this .NET Framework API to .NET Core. See this announcement for details.

@terrajobst Does .NET Core have a viable alternative for local speech synthesis?

I'm also blocked by not having this.

This System.Speech namespace is VITAL to the visually disabled community, as vital as System.Console is for everybody else. Not having this has been a HUGE DEAL.

We NEED a viable, LOCAL speech synthesis API. An Azure service that has to be used over the network WILL NOT WORK for the disabled community.

Maybe Microsoft.Speech could be used instead if System.Speech is not available.

@birbilis There is no such package as Microsoft.Speech. Did you mean Microsoft.CognitiveServices.Speech? If so, that package will not work without connecting to Azure.

@kolappannathan there used to be a Microsoft.Speech one too, check my code at https://github.com/Zoomicon/SpeechLib/blob/master/SpeechLib.Recognition/SpeechRecognition.cs

Please move discussion to the open issue https://github.com/dotnet/wpf/issues/2935

@birbilis Wow. I searched NuGet.org and couldn't find the package you mentioned. Is it still available?

The old SDK worked well and, most importantly, worked locally. The new one is about getting people to subscribe to it on Azure, which isn't that bad other than you have to be online for it to work. It's a shame this can't be done locally anymore.

I've ported it, if you need it:
https://github.com/ststeiger/VoiceRecognition/tree/master/System.Speech

However, the quality is borderline crap, I have no idea how you can get support for a particular language, and it only works on Windows (the Speech SDK needs to be installed).
For some reason, more languages are available in the full .NET Framework version.

You'd better look for a more modern speech SDK, such as Facebook's wav2letter++, Baidu's DeepSpeech2, Kaldi, Julius, CMUSphinx, or Mozilla's DeepSpeech.

Google also has an excellent API for that, but it ain't particularly cheap.

This is really unfortunate to run into, and it ends my interest in .NET Core/.NET 5. Without local speech, the platform is useless to me.

@ocdtrekkie: You could use kaldi-gstreamer-server; then you can do it from .NET via WebSockets:
https://github.com/alumae/kaldi-gstreamer-server

Anything that goes over the network, server or client, does not meet basic accessibility requirements here: the network adds lag and cost, and it is a burden on the user.

Many who are blind don't even have easy access to the internet: a computer with the required accessibility tools is often out of their price range, and open-source options are limited and actively fought against by the big companies that make most of their money selling through insurance claims.

Also, a braille terminal is EXPENSIVE; most people who need one have to buy it through insurance, because they can cost thousands of dollars depending on the model, and most people who want one don't have that money.

.NET Core NEEDS System.Speech.* and a standard Console.Speak(string text) to be inclusive and support the disabled. An Azure server or service is directly at odds with the accessibility requirements here and will not work.

However, the quality is borderline crap

This is common for speech synthesis; it always sounds bad to the community, because a natural-sounding voice requires voice sampling and other non-trivial work. That's why Siri, Cortana, Google, etc. all use a real person's voice as the base for the phonetic mappings in the retail offerings that sighted people get access to. But thankfully the voice itself is a separate problem, so we don't need to worry about it here.

I have no idea how you can get support for a particular language

Register a voice for that language, one that supports that language's phonetics. Again, that's a voice-construction issue and not technically part of this request, since MS already has voices we can use to get this functional as an MVP on Windows.

Linux/Mac support could just use a flite- or festival-based plugin for v1. All it has to do is pass the language and the text to be spoken, then let the system handle it asynchronously.

and it only works on Windows.

That's easy to fix if the code is portable. Otherwise you need a public interface with a method that checks the current config/platform and the current platform's installed voices, matches that platform and language to a voice, then calls that voice with the input text to synthesize the final speech. The method developers would need to worry about might have a signature like:

ResultEnum Speak(Context context, string text);

... where the platform and language used are carried in the context object, so an unsupported platform could just return ResultEnum.NOT_SUPPORTED_YET when the engine being delegated to cannot handle it.

The speech engine could easily do this. Providing support does not require full support for multiple platforms; it just needs Windows support and a good architectural design that lets devs add platform support by extending a base object through a common, public-enough interface. This could just be a plugin architecture, and it would work, as sketched below.

For some reason, more languages are available in the full .NET Framework version.

Again, voices are based on languages, so you're going to use a different voice depending on your language preferences.

Adding this to the Microsoft.Windows.Compatibility package would be great, at least for porting existing applications on Windows.

Speech and speech recognition systems are the fashion of the moment, and unfortunately Microsoft saw a great chance to make money there. That's why so many updates eliminated APIs that existed in Windows: System.Speech and its SpeechSynthesizer were eliminated because, while they were available, they were an obstacle to Microsoft selling Azure. The worst thing is that programmers accepted this manipulation and have encouraged the use of Azure. However, programmers who are unable to acquire Azure licenses, or who have difficulty maintaining a stable connection to use the service, find themselves at a dead end. This explains why Microsoft wants to eliminate the .NET Framework, as that kills most of these old APIs that do not generate income for Microsoft. Very sad. I have used these APIs in a local offline application for communities that do not have real-time network access, and I am concerned about the moment when Microsoft ends support for the .NET Framework.

I have used these APIs in a local offline application for communities that do not have real-time network access, and I am concerned about the moment when Microsoft ends support for the .NET Framework.

One cool feature: you can use the text-to-speech engine integrated into Google Chrome (the Web Speech API) from JavaScript.
That way, you can do it right in the browser.
No server backend required, no server round trip necessary.

// List the voices the browser provides.
speechSynthesis.getVoices().forEach(function (voice) {
  console.log(voice.name, voice.default ? '(default)' : '');
});

// Note: in Chrome, getVoices() can return an empty array until the
// 'voiceschanged' event has fired.
var msg = new SpeechSynthesisUtterance();
var voices = window.speechSynthesis.getVoices();
msg.voice = voices[10]; // pick a voice from the list above
msg.volume = 1; // from 0 to 1
msg.rate = 1;   // from 0.1 to 10
msg.pitch = 1;  // from 0 to 2
msg.text = 'Como estas Joel';
msg.lang = 'es';
speechSynthesis.speak(msg);

https://dev.to/asaoluelijah/text-to-speech-in-3-lines-of-javascript-b8h

I'm sorry, but I don't know anything about JavaScript yet.
