Spleeter: Can this be used for noise removal on conversation audio files?

Created on 5 Nov 2019 · 2 Comments · Source: deezer/spleeter

Hello Team,
Thank you for your effort in building such a fantastic package. This is not a bug; it's more a question about what this package can do.
Please let me know whether spleeter can be used for removing noise from audio files. By analogy, if I treat the noise as the music and the conversation as the singer's voice, can I separate the noise from a recorded conversation using spleeter?
Your input will be helpful.

question

AIGyan

Most helpful comment

I am not affiliated with the project, but FWIW :

I grabbed 1:00 to 3:00 of the audio of this BMXer recording himself on a GoPro camera, riding a bike around Los Angeles and speaking to the camera.

https://www.youtube.com/watch?v=a0vgJ3TeJzA

I then ran it through spleeter's 2stems model:

spleeter separate -o ./audio_output/ -p spleeter:2stems -i input/00-YOUTUBE/a0vgJ3TeJzA.flac
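
If you want to try this on several recordings, the command above can be wrapped in a few lines of Python. This is only a rough sketch: the helper names `build_spleeter_command` and `separate_all` are made up for illustration, and the flags are copied from the command quoted above. Other spleeter releases may expect different arguments (for example, the input file as a positional argument instead of `-i`), so check `spleeter separate --help` for your version.

```python
import shlex
import subprocess


def build_spleeter_command(input_path, output_dir, model="spleeter:2stems"):
    """Return the argument list for the `spleeter separate` call quoted above.

    Hypothetical helper: it only reuses the -o, -p and -i flags from the thread.
    """
    return [
        "spleeter", "separate",
        "-o", str(output_dir),
        "-p", model,
        "-i", str(input_path),
    ]


def separate_all(input_paths, output_dir, dry_run=True):
    """Build one separation command per input file and optionally run it."""
    commands = [build_spleeter_command(p, output_dir) for p in input_paths]
    for cmd in commands:
        if dry_run:
            # Only show what would be executed.
            print(shlex.join(cmd))
        else:
            # Raises CalledProcessError if spleeter exits with a non-zero status.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `separate_all(["a.flac", "b.flac"], "./audio_output/")` prints the two commands without running them; pass `dry_run=False` once spleeter is installed and on your PATH.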

I then merged the two files into one, with the vocals panned hard left and the accompaniment panned hard right:

ffmpeg -i vocals.wav -i accompaniment.wav -filter_complex "[0:a][1:a]amerge=inputs=2,pan=stereo|c0<c0+c1|c1<c2+c3[a]" -map "[a]" stereo.split.accomp.and.vocals.wav
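
For anyone without ffmpeg at hand, roughly the same merge can be sketched with Python's standard `wave` module. In the filter above, `amerge` stacks the two stereo files into four channels, and each `pan` expression with `<` averages a pair of them (the gains are renormalized to sum to 1). The sketch below mirrors that under some assumptions: both files are 16-bit PCM WAV with the same sample rate, and the function names `read_mono` and `merge_hard_panned` are made up for illustration.

```python
import array
import sys
import wave


def read_mono(path):
    """Read a 16-bit PCM WAV file; return (sample_rate, list of mono samples)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    # Average the channels of every frame (integer division, rounds down).
    mono = [sum(samples[i:i + channels]) // channels
            for i in range(0, len(samples), channels)]
    return rate, mono


def merge_hard_panned(left_path, right_path, out_path):
    """Write a stereo WAV with one input on the left and the other on the right."""
    rate_left, left = read_mono(left_path)
    rate_right, right = read_mono(right_path)
    if rate_left != rate_right:
        raise ValueError("sample rates differ")
    frames = min(len(left), len(right))  # stop at the shorter input
    interleaved = array.array("h")
    for l_sample, r_sample in zip(left[:frames], right[:frames]):
        interleaved.append(l_sample)
        interleaved.append(r_sample)
    if sys.byteorder == "big":
        interleaved.byteswap()
    with wave.open(out_path, "wb") as out:
        out.setnchannels(2)
        out.setsampwidth(2)
        out.setframerate(rate_left)
        out.writeframes(interleaved.tobytes())
    return frames
```

Something like `merge_hard_panned("vocals.wav", "accompaniment.wav", "stereo.split.accomp.and.vocals.wav")` should then give a file with the vocals on the left and the accompaniment on the right.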

The result was the bicyclist talking in my left ear, and a bunch of background noise in my right ear. When I muted the right channel, I heard the speaking quite clearly in my left ear. When I muted the left channel, I heard almost no speaking in the right ear. The only time I heard his voice in the right ear was when it could be mistaken for a percussive drum sound. I sometimes heard people other than him speaking in the left ear, when he biked past people who were talking.

I believe this is a sufficient proof of concept for using spleeter's 2stems model to isolate spoken human voices from recordings with background noise. Try it yourself and see!

awesomer on 6 Nov 2019

👍8 🚀2

All 2 comments

Hi @AIGyan

We haven't done any sort of evaluation on this task, nor was our model trained on such examples. Since speech enhancement and denoising are an active research field, I assume there are more specialized tools for that out there.

That being said, I can only agree with @awesomer: feel free to try it out and let us know what you find!

mmoussallam on 7 Nov 2019

👍2
