Deepspeech: Saw a non-null label (index >= num_classes - 1) following a null label

Created on 4 Mar 2017 · 7 Comments · Source: mozilla/DeepSpeech

I was running the TED demo (run-ted.sh) and hit the following problem.

tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 4 num_classes: 29 labels: 9,0,1,12,15,14,7,0,23,9,20,8,0,8,21,14,4,18,5,4,19,0,15,6,0,15,20,8,5,18,0,22,15,12,21,14,20,5,5,18,19,0,11,14,5,23,0,23,5,0,3,15,21,12,4,14,20,0,10,21,19,20,0,19,9,20,0,1,20,0,8,15,13,5,0,19,15,0,9,0,4,5,3,9,4,5,4,0,20,15,0,10,15,9,14,0,20,8,5,13,0,6,15,18,0,20,8,18,5,5,0,23,5,5,11,19,0,15,14,0,13,1,25,0,20,8,5,0,20,8,9,18,20,5,5,14,20,8,0,9,0,13,1,4,5,0,13,25,0,23,1,25,0,20,15,0,20,8,5,0,20,15,23,14,0,15,6,0,15
[[Node: tower_1/CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](tower_1/Reshape_7/_153, tower_1/ToInt64/_267, tower_1/Gather, tower_1/padding_fifo_queue_DequeueMany:1)]]
[[Node: tower_1/edit_distance/_271 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:1", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_1885_tower_1/edit_distance", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:1"]()]]

Caused by op 'tower_1/CTCLoss', defined at:
File "DeepSpeech.py", line 1136, in <module>
last_train_wer, last_dev_wer, hibernation_path = train()
File "DeepSpeech.py", line 1040, in train
train_context = create_execution_context('train')
File "DeepSpeech.py", line 790, in create_execution_context
tower_results = get_tower_results(data_set, optimizer=optimizer)
File "DeepSpeech.py", line 469, in get_tower_results
calculate_accuracy_and_loss(batch_set, no_dropout if optimizer is None else dropout_rates)
File "DeepSpeech.py", line 351, in calculate_accuracy_and_loss
total_loss = ctc_ops.ctc_loss(labels=batch_y, inputs=logits, sequence_length=batch_seq_len)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 145, in ctc_loss
ctc_merge_repeated=ctc_merge_repeated)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 164, in _ctc_loss
name=name)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
self._traceback = _extract_stack()

Tangzy7
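To make the error message easier to reason about, here is a minimal sketch of the condition it appears to describe: once a label with index >= num_classes - 1 (treated as "null") has been seen, any later label below that threshold is rejected. This is a reading of the message text only, not TensorFlow's actual implementation, and the function name `first_label_after_null` is hypothetical.

```python
from typing import Optional, Sequence


def first_label_after_null(labels: Sequence[int], num_classes: int) -> Optional[int]:
    """Return the position of the first ordinary label that follows a null label.

    Assumption (taken from the wording of the error message): any label
    with index >= num_classes - 1 counts as a null label.
    Returns None when no such position exists.
    """
    null_threshold = num_classes - 1
    seen_null = False
    for position, label in enumerate(labels):
        if label >= null_threshold:
            seen_null = True
        elif seen_null:
            # An ordinary label after a null label: the case the message names.
            return position
    return None


# Illustrative inputs only, not taken from the failing batch.
print(first_label_after_null([9, 0, 1, 12], num_classes=29))   # None
print(first_label_after_null([9, 28, 1, 12], num_classes=29))  # 2
```

Running a check like this on each label sequence before it reaches the loss can help narrow down which transcript, if any, contains an out-of-range index.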

All 7 comments

I've never seen this problem before. Judging from this line of the traceback,

...
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
...

you appear to be running Python 3.5. To my knowledge, we have never tested on Python 3.x.

Is it possible for you to test on Python 2.7 and see whether the problem persists?

kdavis-mozilla on 4 Mar 2017

@Tangzy7 We've now updated our code for Python 3. Could you recheck with master? Thanks!

kdavis-mozilla on 3 Apr 2017

@Tangzy7 I'm going to close this as inactive.

kdavis-mozilla on 15 Apr 2017

Has anyone found an answer?

danFromTelAviv on 30 May 2018

@danFromTelAviv the Unicode support added some time ago should make this essentially impossible. You should get a KeyError in text.py before you ever get an error in the CTC loss function. Are you by any chance using a (very) old version of the code?

reuben on 2 Jun 2018

👍1
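The KeyError behaviour described in the comment above can be illustrated with a small sketch. This is not the project's text.py; the alphabet (space, a-z, apostrophe) and the names `ALPHABET`, `CHAR_TO_LABEL` and `text_to_labels` are assumptions chosen to be consistent with the labels in the log (space = 0, a = 1, and so on).

```python
import string

# Assumed alphabet: space, a-z, apostrophe (28 characters).
ALPHABET = " " + string.ascii_lowercase + "'"
CHAR_TO_LABEL = {ch: index for index, ch in enumerate(ALPHABET)}


def text_to_labels(text):
    """Map each character to its label; an unknown character raises KeyError."""
    return [CHAR_TO_LABEL[ch] for ch in text]


print(text_to_labels("i along"))  # [9, 0, 1, 12, 15, 14, 7]

try:
    text_to_labels("café")
except KeyError as err:
    # The lookup fails here, before any label reaches the loss function.
    print("character not in alphabet:", err)
```

With a dictionary lookup like this, a transcript containing a character outside the alphabet fails early, so an out-of-range label never reaches the CTC loss.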

I resolved the issue (I feel a little dumb in retrospect). You must specify a number of classes that is one more than the number of characters you have.

danFromTelAviv on 11 Jun 2018

👍3
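The "one more than the number of characters" rule can be sketched as follows. The extra class is the CTC blank, which TensorFlow's `ctc_loss` reserves as the last index (num_classes - 1). The alphabet below (space, a-z, apostrophe) is an assumption that happens to match `num_classes: 29` in the log, and `check_labels` is a hypothetical helper.

```python
import string

# Assumed alphabet of 28 characters: space, a-z, apostrophe.
ALPHABET = " " + string.ascii_lowercase + "'"

# One extra class for the CTC blank, which takes the last index.
NUM_CLASSES = len(ALPHABET) + 1
BLANK_INDEX = NUM_CLASSES - 1


def check_labels(labels, num_classes=NUM_CLASSES):
    """Raise ValueError if any label falls outside 0 .. num_classes - 2."""
    bad = [label for label in labels if not 0 <= label < num_classes - 1]
    if bad:
        raise ValueError(
            "labels outside [0, %d): %r" % (num_classes - 1, bad)
        )
    return True


print(NUM_CLASSES, BLANK_INDEX)  # 29 28
check_labels([9, 0, 1, 12, 15, 14, 7])  # valid: every label is below 28
```

If the network's output layer were sized to `len(ALPHABET)` instead of `len(ALPHABET) + 1`, the highest character index would collide with the blank index, which is the kind of mismatch this comment describes.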

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.