DeepSpeech: import_cv2 fails with filter_alphabet option for validation dataset

Created on 5 Jun 2019 · 6 Comments · Source: mozilla/DeepSpeech

Hello, I'm facing an issue when I try to build the training data from the Common Voice dataset, version 2. The documentation says to run the bin/import_cv2.py script, and it works fine except when I pass the --filter_alphabet option with DeepSpeech's default alphabet.txt file.
When I run this command:

bin/import_cv2.py --filter_alphabet /content/alphabet.txt /content

I get the following error:

Loading TSV file:  /content/train.tsv
Saving new DeepSpeech-formatted CSV file to:  /content/clips/train.csv
Importing mp3 files...
Progress |######################################################| 100% completed
Writing CSV file for DeepSpeech.py as:  /content/clips/train.csv
Progress |######################################################| 100% completed
Imported 12123 samples.
Skipped 103 samples that failed on transcript validation.
Skipped 12 samples that were longer than 10 seconds.
Final amount of imported audio: 14:52:21.
Loading TSV file:  /content/test.tsv
Saving new DeepSpeech-formatted CSV file to:  /content/clips/test.csv
Importing mp3 files...
Progress |##################################################### |  98% completed
Writing CSV file for DeepSpeech.py as:  /content/clips/test.csv
Progress |######################################################| 100% completed
Imported 6810 samples.
Skipped 360 samples that failed on transcript validation.
Skipped 206 samples that were longer than 10 seconds.
Final amount of imported audio: 10:21:17.
Loading TSV file:  /content/dev.tsv
Saving new DeepSpeech-formatted CSV file to:  /content/clips/dev.csv
Importing mp3 files...
Progress |############################################          |  82% completed
Traceback (most recent call last):
  File "bin/import_cv2.py", line 165, in <module>
    _preprocess_data(PARAMS.tsv_dir, AUDIO_DIR, label_filter_fun, PARAMS.space_after_every_character)
  File "bin/import_cv2.py", line 43, in _preprocess_data
    _maybe_convert_set(input_tsv, audio_dir, label_filter, space_after_every_character)
  File "bin/import_cv2.py", line 100, in _maybe_convert_set
    for i, _ in enumerate(pool.imap_unordered(one_sample, samples), start=1):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 761, in next
    raise value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "bin/import_cv2.py", line 95, in one_sample
    counter['total_time'] += frames
UnboundLocalError: local variable 'frames' referenced before assignment

I'm running the training on Google Colab with an NVIDIA Tesla K80 (12 GB of VRAM) and tensorflow-gpu installed.

Wissben

Most helpful comment

@lissyx Looks like your "Computing audio hours at import" commit uses frames even when it has not been assigned. All other uses of frames seem to be protected by a check that file_size, and thus frames, has been assigned.
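
To illustrate the failure mode described in this comment, here is a minimal sketch. It is not the actual code from bin/import_cv2.py: the function names, parameters and counter keys other than 'total_time' are hypothetical. It assumes, as the comment suggests, that frames is only bound in the same branch that sets file_size, while the total_time update runs for every sample.

```python
from collections import Counter


def one_sample_buggy(counter, file_size, frame_count):
    """Hypothetical reduction of the failing pattern from the traceback."""
    if file_size != -1:
        # frames is bound only when the audio file was found
        frames = frame_count
    if file_size == -1:
        counter['failed'] += 1
    else:
        counter['imported'] += 1
    # Runs for every sample, so it raises UnboundLocalError
    # whenever the branch above was skipped
    counter['total_time'] += frames


def one_sample_fixed(counter, file_size, frame_count):
    """Same logic, with a default so the final update is always safe."""
    frames = 0  # assumed default: a failed sample adds no audio time
    if file_size != -1:
        frames = frame_count
    if file_size == -1:
        counter['failed'] += 1
    else:
        counter['imported'] += 1
    counter['total_time'] += frames


counter = Counter()
one_sample_fixed(counter, file_size=-1, frame_count=0)         # failed sample
one_sample_fixed(counter, file_size=32044, frame_count=16000)  # valid sample
```

Giving frames a default is only one possible fix. Moving the total_time update into the branch that accepts the sample would work as well, and would also keep skipped samples out of the reported audio duration.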