Common-voice: When validating clips, allow user to click "Nope" early if the recording is long and incorrect.

Created on 29 Jul 2017 · 8 Comments · Source: mozilla/common-voice

There are two really long, useless recordings. It's weird that there's no limit on audio recording length.

The first one was "That it was unable to deal with this boy who spoke the language of the world". It sounds like they switched tabs and started reviewing voice samples.

The other was "That was until the site was blocked", which sounds like kitchen noise.

I think the solution to this is to have the 'Nope' vote button become enabled after a couple of minutes of playback.

help wanted

All 8 comments

Ahhh, what an annoying bug!

Here are some solutions:

  • Nope being enabled faster (as you suggested)
  • A skip button, which could solve some other UX issues at the same time
  • Automatically detect recordings that are too long by estimating syllable count

Of the three, I think I like your idea best. What do you think?
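The third option above, flagging over-long recordings from an estimated syllable count, could be sketched roughly as follows. This is only an illustration: the vowel-run heuristic, the speaking-rate floor, the padding value, and all function names are assumptions, not anything taken from the Common Voice code.

```python
import re

# Assumed floor on speaking rate; not a value from the project.
MIN_SYLLABLES_PER_SECOND = 2.0
# Assumed allowance for silence before and after the sentence.
PADDING_SECONDS = 2.0


def estimate_syllables(sentence: str) -> int:
    """Rough English syllable count: one per vowel run, at least one per word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(max(1, len(re.findall(r"[aeiouy]+", word))) for word in words)


def max_expected_seconds(sentence: str) -> float:
    """Longest duration a slow reading of the sentence should plausibly take."""
    return estimate_syllables(sentence) / MIN_SYLLABLES_PER_SECOND + PADDING_SECONDS


def is_suspiciously_long(sentence: str, clip_seconds: float) -> bool:
    """Flag a clip whose duration exceeds the estimate for its sentence."""
    return clip_seconds > max_expected_seconds(sentence)
```

A heuristic like this miscounts silent vowels and would need per-language tuning, which may be part of why the simpler options were preferred.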

I like the nope idea best of these three options, but I would suggest also using a hard time limit on recordings, so there would be less stress on the servers as well.
Or the two combined? Nope available at 10(?) seconds and the hard time limit for recordings would be 20(?) seconds.
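As a minimal sketch of the combined rule, assuming the tentative 10 and 20 second values floated here (the class and function names are hypothetical, and the real client would wire this into its playback and recording events):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipLimits:
    # Both defaults are the tentative numbers from the comment above,
    # not settings confirmed by the project.
    nope_after_seconds: float = 10.0
    max_recording_seconds: float = 20.0


def nope_enabled(listened_seconds: float, limits: ClipLimits = ClipLimits()) -> bool:
    """The 'Nope' button unlocks once the validator has listened long enough."""
    return listened_seconds >= limits.nope_after_seconds


def should_stop_recording(recorded_seconds: float, limits: ClipLimits = ClipLimits()) -> bool:
    """The recorder cuts off once the hard time limit is reached."""
    return recorded_seconds >= limits.max_recording_seconds
```

Keeping both thresholds in one object makes it easy to try other values, for example `ClipLimits(nope_after_seconds=3.0)`.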

Edit: not related to this topic, but if somebody laughs/snorts/sneezes etc. before/after/during the sentence what should we vote on?

I prefer the original solution, but the other two sound interesting too.

> not related to this topic, but if somebody laughs/snorts/sneezes etc. before/after/during the sentence what should we vote on?

My thinking is that you should ignore laughs/sneezes/etc (so long as they didn't say anything). @kdavis-mozilla ?

Good question. In larger academic speech corpora the various non-speech acts (coughs, sneezes, breaths, ...) are transcribed into the text:

_Hello <cough> how are you?_

and a speech-to-text system is trained to recognize and ignore the non-speech acts. However, we don't have the ability to transcribe the non-speech acts into our text.

So the questions are:

_Is it better to leave the non-speech acts in the audio with the understanding that a speech-to-text system will learn to ignore them as they don't appear in the transcript?_

or

_Is it better to mark audio with non-speech acts as invalid with the understanding that a speech-to-text system trained on such data would get confused by the presence of the non-speech acts?_

I think, given the progress of speech-to-text systems, leaving the non-speech acts (coughs, sneezes, breaths, ...) in the audio is the way to go, despite the fact that they will not appear in the transcript.

Perhaps just start by setting a hard limit of (say) 60 seconds and, after submission, silently discarding samples longer than this? This would deal with the worst cases without needing to create any complex logic.
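A rough sketch of that server-side filter, using the 60-second example value from this comment. The `Clip` record and its fields are hypothetical; the real submission pipeline presumably stores clips differently.

```python
from typing import Iterable, List, NamedTuple

# The 60-second cap is the example value from the comment above.
MAX_CLIP_SECONDS = 60.0


class Clip(NamedTuple):
    clip_id: str              # hypothetical identifier
    duration_seconds: float   # measured length of the submitted audio


def keep_acceptable(clips: Iterable[Clip], max_seconds: float = MAX_CLIP_SECONDS) -> List[Clip]:
    """Return only the clips at or under the cap; longer ones are dropped silently."""
    return [clip for clip in clips if clip.duration_seconds <= max_seconds]
```

For example, `keep_acceptable([Clip("a", 4.2), Clip("b", 310.0)])` keeps only clip `"a"`.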

Imo the simplest thing we could do right now is to enable "voting no" after 1s of listening. Any objections to that?

3 seconds it is!
