Mumble: Test and improve RNNoise and add some info in UI & wiki

Created on 17 May 2020 · 38 Comments · Source: mumble-voip/mumble

Context:
RNNoise, Audiofilters

Description:
As pointed out by @fedetft in https://github.com/mumble-voip/mumble/issues/4127#issuecomment-629653914

libspeexdsp already includes a noise-canceller that is enabled by default
rnnoise is maybe not implemented correctly and should be tested once more
rnnoise might interfere with libspeexdsp (whether that's the case should be tested)

Potential Todo:

[ ] Test RNNoise (again) alongside all other Audio-Filters (Noise-Cancel by libspeex & Echo-Cancel by libspeex)
[ ] Check whether noise-cancel by libspeex can (or should) be deactivated, if RNNoise is used
[x] (fixed in #4212) Improve the implementation of RNNoise (especially the order of the filters)
[ ] (optional) inform users that noise-cancel is already active (by libspeex) and that RNNoise is an additional filter (later this could be modified to cover other potential use cases, listed above)

documentation

toby63

Most helpful comment

Maybe remove it until it's fixed?

I would still recommend to disable it or notify the user not to use it for now.

A compromise would be: how about labeling it as experimental feature?

I strongly disagree with labeling RNNoise as experimental and especially so with removing it. RNNoise is one of the prominent features of Mumble and everyone that I know who has used it remarks that it is the best noise suppression out of every voice chat program that they have ever used. Removal or discouraging use of RNNoise would force me to maintain a fork of and provide builds of Mumble retaining the feature. To those who are advocating against it: Have you actually used it?

TredwellGit on 18 May 2020

👍5

All 38 comments

libspeexdsp already includes a noise-canceller that is enabled by default
rnnoise is maybe not implemented correctly and should be tested once more
rnnoise might interfere with libspeexdsp

That's not information that concerns the end-user. Therefore it shouldn't be in the UI. If you want to, you can create a wiki page for it, so everyone interested can have a look.

Krzmbrzl on 17 May 2020

Explain RNNoise better.

What kind of explanation would you like to have in the UI?

Krzmbrzl on 17 May 2020

Let's wait until the RNNoise / speexdsp stuff is fixed / improved. I would like to have an option to switch between SpeexDSP, RNNoise and "off". Then we could add some short explanation about the differences.

I don't think it is useful to add a list of reasons why you shouldn't use RNNoise. Maybe remove it until it's fixed?

streaps on 17 May 2020

Maybe remove it until it's fixed?

Uhm, I guess that'd be an option. On the other hand, if we remove it now, it'll seem like a step backwards to the end-user. Given that it has been in this state for a while, I don't think it'll hurt to leave it that way until it's fixed...

Krzmbrzl on 17 May 2020

👍2

Uhm, I guess that'd be an option. On the other hand, if we remove it now, it'll seem like a step backwards to the end-user. Given that it has been in this state for a while, I don't think it'll hurt to leave it that way until it's fixed...

Well, it might not function in the way the user desires. This could be seen as a bug, and bugs should be fixed.
I would still recommend to disable it or notify the user not to use it for now.

A compromise would be: how about labeling it as experimental feature?

Regarding the info: The user should at least be informed that noise cancelling is already happening (through libspeex).
Of course he doesn't need to know all the technical details, but he should know that it is not necessary to enable RNNoise.

toby63 on 17 May 2020

👎1

My personal observation, after testing over the last few days, is that RNNoise is the only thing that lets me use voice activity mode with my microphone without triggering voice activity when I type (due to the position of my mechanical keyboard and microphone).

jj777 on 18 May 2020

👍1

Maybe remove it until it's fixed?

I would still recommend to disable it or notify the user not to use it for now.

A compromise would be: how about labeling it as experimental feature?

I strongly disagree with labeling RNNoise as experimental and especially so with removing it. RNNoise is one of the prominent features of Mumble and everyone that I know who has used it remarks that it is the best noise suppression out of every voice chat program that they have ever used. Removal or discouraging use of RNNoise would force me to maintain a fork of and provide builds of Mumble retaining the feature. To those who are advocating against it: Have you actually used it?

TredwellGit on 18 May 2020

👍5

I guess the removal discussion is settled then. It's obviously good enough that it's useful for some users.

streaps on 18 May 2020

👍3

@TredwellGit
If it works fine, I won't object to it being kept enabled :slightly_smiling_face: .

I only wanted to give some information that came up, thanks to someone looking at the code.

Nonetheless I think the general purposes of this issue remain:

RNNoise should be tested once more and the implementation improved if necessary
Users should be informed that noise-cancel is already active (with libspeex) and that RNNoise is just an additional filter

I changed the title for that.

toby63 on 18 May 2020

👍1

Users should be informed that noise-cancel is already active (with libspeex) and that RNNoise is just an additional filter

We shouldn't add warnings directed to the user for problems in the code. These problems should be fixed.

streaps on 18 May 2020

👍1

RNNoise is the only thing that lets me use voice activity mode with my microphone without triggering voice activity when I type (due to the position of my mechanical keyboard and microphone).

Shouldn't VAD be able to distinguish between voice and keyboard sounds (with or without RNNoise enabled)? I wonder if Voice Activity works more like a noise gate in Mumble.

streaps on 18 May 2020

Shouldn't VAD be able to distinguish between voice and keyboard sounds (with or without RNNoise enabled)? I wonder if Voice Activity works more like a noise gate in Mumble.

Might be a case for another issue report then :wink:.

We shouldn't add warnings directed to the user for problems in the code. These problems should be fixed.

Well, I think a notice about already active noise filters is not a warning. It just clarifies that you only need RNNoise if you still have noise problems and want to solve them.

toby63 on 18 May 2020

VAD in Mumble is currently a bit basic. You have two options:

Amplitude: threshold-based. Anything that hasn't been removed by the echo/noise canceller and has a high enough amplitude triggers voice activation.
Signal to noise: using libspeexdsp, "replaced by a hack pending a complete rewrite" by the libspeexdsp developers. No idea what the hack is in detail. Is anyone using this option?
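
To illustrate the amplitude option, here is a minimal Python sketch of a threshold-based trigger. It is not Mumble's code: the function names, the 480-sample frame length and the 0.1 threshold are assumptions made purely for illustration. It shows why a sharp key click can open transmission just as speech does, since only the level is examined.

```python
import math

def frame_peak(frame):
    """Largest absolute sample in a frame of floats in [-1.0, 1.0]."""
    return max((abs(s) for s in frame), default=0.0)

def amplitude_vad(frames, threshold=0.1):
    """One bool per frame: True when the frame's peak exceeds the threshold."""
    return [frame_peak(f) > threshold for f in frames]

# Three hypothetical 10 ms frames at 48 kHz (480 samples each).
quiet = [0.01] * 480                      # background hiss only
click = [0.01] * 479 + [0.5]              # hiss plus one loud key click
speech = [0.3 * math.sin(2 * math.pi * 200 * n / 48000) for n in range(480)]

decisions = amplitude_vad([quiet, click, speech])
print(decisions)
```

The quiet frame stays closed, while both the click and the speech frame open the gate. A level-only trigger cannot tell them apart, which is why removing the click before this stage matters.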

Also consider that before https://github.com/mumble-voip/mumble/pull/4167 any "smart" voice activation that would try to detect an actual voice was basically impossible in Mumble, as it would have been triggered by the echo...

fedetft on 18 May 2020

👍1

I was using signal to noise, but it tended to pass "voice" if I was silent for too long. I guess that ratio is relative to the previous audio frames or something like that...

Krzmbrzl on 18 May 2020

In any case, my comment from having read the code is that rnnoise seems to have been bodged in rather than being well integrated, or at least the curious design choices that have been made have not been documented in the source code.

In particular:

Rnnoise is applied before the echo canceller, while libspeexdsp's noise canceller is applied after, why?
When rnnoise is enabled, libspeexdsp's noise canceller is not disabled, leading to two noise cancellers being run, and with the echo canceller in the middle, why?

These choices may also have unintended consequences:

Rnnoise may impact the efficacy of the echo canceller
We may be wasting CPU for nothing by running two noise cancellers in a row

I think that it may make sense to try a more conventional configuration, such as disabling libspeexdsp's noise canceller when rnnoise is active and putting it after the echo canceller. Can someone like @TredwellGit try such a configuration and see whether rnnoise works just as well?
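
The ordering question can be sketched as follows. This is a Python illustration only, not Mumble's actual code: the stage names and the functions `current_chain`, `proposed_chain` and `run_chain` are hypothetical, and the "current" order is simply the one described in this comment.

```python
def current_chain(use_rnnoise):
    """Order as described above: RNNoise first, then echo canceller, then Speex denoise."""
    chain = ["rnnoise"] if use_rnnoise else []
    chain += ["echo_cancel", "speex_denoise"]
    return chain

def proposed_chain(use_rnnoise):
    """Echo cancellation first, followed by exactly one noise suppressor."""
    return ["echo_cancel", "rnnoise" if use_rnnoise else "speex_denoise"]

def run_chain(chain, stages, frame):
    """Apply each named stage to the frame, in order."""
    for name in chain:
        frame = stages[name](frame)
    return frame

# Placeholder stages that only record the order in which they ran.
names = ["echo_cancel", "rnnoise", "speex_denoise"]
stages = {n: (lambda frame, n=n: frame + [n]) for n in names}

print(run_chain(current_chain(True), stages, []))
print(run_chain(proposed_chain(True), stages, []))
```

With RNNoise enabled, the first chain runs three stages with the echo canceller in the middle, while the second runs two, with the suppressor seeing only echo-cancelled audio.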

Once rnnoise is properly integrated, Mumble devs should also advertise it more, both in the input configuration and in the wiki, otherwise potentially interested users will have a hard time discovering it.

fedetft on 18 May 2020

👍3

RNNoise works well at filtering out typing sounds and eating/chewing food sounds. Whatever replaces it, if anything, shouldn't be any worse in these aspects.

trudnorx on 18 May 2020

👍1

RNNoise also provides VAD, but we currently rely on libspeexdsp for that.

We should always use RNNoise's VAD when available and provide an option to completely disable the Speex preprocessor (#3323).

As for improving our RNNoise implementation: last year the library's API was changed so that it allows saving/loading the machine learning progress. Ideally, we should save the progress either in the configuration file or in the SQLite database.

davidebeatrici on 19 May 2020

👍2

If rnnoise provides VAD, then that's another strong point in favor of putting it after echo cancellation, so it doesn't consider echo as a voice.

After https://github.com/mumble-voip/mumble/pull/4167 is merged I might find the time to propose a patch for the rnnoise users to try out.

fedetft on 19 May 2020

👍4

I don't see any advantage in using noise suppression before the echo canceller, especially not with a strong noise suppressor like RNNoise. I think putting RNNoise behind the echo canceller is a win-win for both.

I don't have much experience with that stuff and I didn't do any tests, but my basic understanding of audio processing and experience tells me that putting a noise suppressor before the echo canceller makes no sense at all. Please correct me, if I'm wrong.

streaps on 19 May 2020

My one cent for this discussion: just like @TredwellGit, I use RNNoise extensively, and it's vital to my friends not wanting to kill me for my noisy setup.

My setup has the following flaws:

My house is very old, and the electrical wiring is so badly done that the earthing got oxidised and is no longer working. This, plus the already existing noise in the wires, means that a lot of noise gets through to my computer chassis whenever I'm not on battery power.
My headset is okay, but not great. It has slight electronic noises stemming from its poor connections, and those become very apparent whenever I use the volume controls and the mute switch.

Yet all of these, I repeat, _all of these_, get removed by RNNoise. I am astounded by how good it is. It really is a marvel. These are good times to be alive.

felix91gr on 25 May 2020

👍1

@jj777 @trudnorx @felix91gr as you seem to be power users of the RNNoise feature in Mumble, could you please test the changes made in #4212 ?
We'd need some feedback to verify that it still works as expected :point_up:

Krzmbrzl on 29 May 2020

I can do some testing today/tomorrow. Just to confirm, I should be building using mumble-releng's 1.3 scripts and the branch above? I saw a comment somewhere in the post re: the echo canceller fixes saying it wouldn't work until 1.4, but I can't get the Windows build of that to compile.

jj777 on 30 May 2020

All new features are built against 1.4.0 (current master).
You can simply download the windows installer from the CI though, so you don't have to compile it yourself :)

Krzmbrzl on 30 May 2020

👍1

So just did a test with another user on my server.

This is the first time we've done a test with both people on continuous transmission mode and both on speakers - Windows 10 x64 both using the same CI build from yesterday (30/05).

Some notes:

First off, we noticed echo cancellation wasn't working very well. However, when we ignored the Mumble text instructions about maximising the mic boost in Windows and both reduced the Windows mic level from 100 to around 75-80, the echo canceller seemed to work pretty well (it may make sense to adjust this guidance in the help text before 1.4 goes live).
We had a test conversation with me playing music on speakers, and the music was pretty much cancelled out properly (i.e. the most we heard when listening back to the recording was a tiny quarter-second bit that you'd need to be listening for).
RNNoise seems to work great still in this setup - we did a recording test and it was almost impossible to hear me typing away on my Cherry Blue keys. When I disabled it, it appeared that my typing was very noticeable ("please stop").

EDIT: Though I just noted that the changes in #4212 haven't been merged in yet, so we may have just tested the new echo canceller and the old RNNoise implementation instead, based on that build. Let me know if there's a way of getting a binary of the 4212 code to test.

jj777 on 31 May 2020

👍1

Did you use the installer from the CI? If so you had the changes of that PR included

Krzmbrzl on 31 May 2020

👍1

First off, we noticed echo cancellation wasn't working very well - however, when we ignored the Mumble text instructions around maximising the mic boost in Windows

Maximising the mic boost isn't advisable. This is not an analog tape recorder or a guitar amp. Ideally you want plenty of headroom to avoid any digital clipping (or major analog distortion). If the voice is not loud enough, it can be amplified after the ADC in the digital domain. It really doesn't matter much if you lose 12 dB or 24 dB of dynamic range because the mic input is not levelled to maximum.
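
As a rough worked example of the headroom argument, the Python sketch below assumes 16-bit PCM capture (an assumption; the thread does not state the bit depth) and uses hypothetical helper names. It computes the theoretical dynamic range and shows digital make-up gain, including the clipping that occurs once a sample would exceed full scale.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bits)

def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def apply_gain(samples, db, full_scale=32767):
    """Amplify 16-bit samples digitally, clamping anything past full scale."""
    gain = db_to_gain(db)
    return [max(-full_scale - 1, min(full_scale, round(s * gain))) for s in samples]

total = dynamic_range_db(16)
print(round(total, 2))                   # total range for 16-bit PCM
print(round(total - 12, 2))              # range left after giving up 12 dB
print(round(total - 24, 2))              # range left after giving up 24 dB
print(apply_gain([1000, -1000], 12))     # quiet signal boosted without clipping
print(apply_gain([20000], 12))           # loud signal clamps at full scale
```

Under that assumption, roughly 96 dB is available in total, so recording 12 dB or 24 dB below maximum still leaves about 84 dB or 72 dB, whereas a sample that clips is lost for good.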

streaps on 31 May 2020

Did you use the installer from the CI? If so you had the changes of that PR included

Yep, from the link I included - should be right then! We were pretty happy with what we tested.

jj777 on 1 Jun 2020

First off, we noticed echo cancellation wasn't working very well - however, when we ignored the Mumble text instructions around maximising the mic boost in Windows

Maximising the mic boost isn't advisable. This is not an analog tape recorder or a guitar amp. Ideally you want plenty of headroom to avoid any digital clipping (or major analog distortion). If the voice is not loud enough, it can be amplified after the ADC in the digital domain. It really doesn't matter much if you lose 12 dB or 24 dB of dynamic range because the mic input is not levelled to maximum.

Yep, agree - I think this is the bit where the wording should be potentially reconsidered:

jj777 on 1 Jun 2020

👍1

That's OT to this issue. Feel free to create a PR with the respective changes though :)
(The file in question would be https://github.com/mumble-voip/mumble/blob/master/src/mumble/AudioWizard.ui - can be edited with Qt Designer)

Krzmbrzl on 1 Jun 2020

The audio wizard is useless anyway.

streaps on 1 Jun 2020

😕1

@streaps wasn't for me!

felix91gr on 1 Jun 2020

Slightly related question: does Mumble do any audio processing, filtering, modification, or similar, on the input or output side, that cannot be disabled? Equalization, loudness, compression, anything?

For example, if you have RNNoise off, Suppression off, and Amplification set to 1, is the audio completely direct? Or is there still stuff going on?

grravity on 20 Jul 2020

If the audio stream is not at 48 kHz, Mumble resamples it to that sample rate.

Then, the following speex filters are unconditionally enabled:

auto gain control
dereverb
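
To make the resampling step concrete, here is a minimal Python sketch that converts a block of samples to 48 kHz by linear interpolation. This is not how Mumble does it; linear interpolation is swapped in here only because it is short, and real resamplers use proper filtering. The function name `resample_linear` is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation between neighbouring input samples."""
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the input signal
        j = int(pos)                           # input sample at or before pos
        frac = pos - j                         # fractional distance to the next one
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

# 10 ms of audio at 44.1 kHz is 441 samples; at 48 kHz it is 480.
block_44k = [0.0] * 441
block_48k = resample_linear(block_44k, 44100, 48000)
print(len(block_48k))
```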

fedetft on 20 Jul 2020

@fedetft When you say not at 48KHz, do you mean higher or lower?
And how come we have no ability to change this?

grravity on 20 Jul 2020

When you say not at 48KHz, do you mean higher or lower?

@grravity this usually means lower. Audio nowadays is sampled either at 44.1 KHz or at 48 KHz.

And how come we have no ability to change this?

I don't think it's very sensible to add that ability. Lemme explain why I think that:

The human ear can hear up to about 20 KHz. By the Nyquist-Shannon sampling theorem, you can replicate our experience by sampling audio at a little more than double that frequency, i.e., just over 40 KHz. 48 KHz leaves good headroom on top of that, and is otherwise more than enough for the human ear.
48 KHz and 44.1 KHz are the two main standards for audio sampling, 48 being the most used of the two. Putting in a different sampling rate from those two would mean that support would be harder to achieve at a lower level of the stack, because they are standards.
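
The sampling-theorem arithmetic above can be checked with a few lines of Python. The helper names are hypothetical, and `alias_hz` assumes a pure tone sampled without any anti-aliasing filter.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2

def alias_hz(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling (no anti-aliasing filter)."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_hz(48000))        # limit at 48 kHz
print(nyquist_hz(44100))        # limit at 44.1 kHz
print(alias_hz(20000, 48000))   # below the limit: the tone is preserved
print(alias_hz(30000, 48000))   # above the limit: the tone folds back down
```

Both standard rates put the limit above 20 kHz (24 kHz and 22.05 kHz respectively), so the audible band fits in either, while a 30 kHz tone sampled at 48 kHz would show up as 18 kHz.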

Basically, I feel that the costs outweigh any benefits that you could get from a different sampling rate.

felix91gr on 20 Jul 2020

👍1

Ok this is understandable if it means lower, but if it was higher then I was slightly confused. I fail to see any reason not to set the standard for the rate so this makes sense. Understood. I am a little curious as to why the Speex filters being added would benefit this case? Dereverb and auto gain control? How do those play into the question?

Just out of curiosity, what occurs if it is 48KHz or higher? Would this mean that no processing, modulation, or filtering occurs? Which goes back to my original question.

grravity on 21 Jul 2020

The Opus codec supports 48 kHz sample rate only.

streaps on 22 Jul 2020

Also RNNoise only supports 48KHz, there's no easy way for mumble to support higher sample rates other than rewriting from scratch the dsp stuff it relies upon.

fedetft on 25 Jul 2020
