I was running the TED demo (run-ted.sh) and hit the following problem:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 4 num_classes: 29 labels: 9,0,1,12,15,14,7,0,23,9,20,8,0,8,21,14,4,18,5,4,19,0,15,6,0,15,20,8,5,18,0,22,15,12,21,14,20,5,5,18,19,0,11,14,5,23,0,23,5,0,3,15,21,12,4,14,20,0,10,21,19,20,0,19,9,20,0,1,20,0,8,15,13,5,0,19,15,0,9,0,4,5,3,9,4,5,4,0,20,15,0,10,15,9,14,0,20,8,5,13,0,6,15,18,0,20,8,18,5,5,0,23,5,5,11,19,0,15,14,0,13,1,25,0,20,8,5,0,20,8,9,18,20,5,5,14,20,8,0,9,0,13,1,4,5,0,13,25,0,23,1,25,0,20,15,0,20,8,5,0,20,15,23,14,0,15,6,0,15
[[Node: tower_1/CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](tower_1/Reshape_7/_153, tower_1/ToInt64/_267, tower_1/Gather, tower_1/padding_fifo_queue_DequeueMany:1)]]
[[Node: tower_1/edit_distance/_271 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:1", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_1885_tower_1/edit_distance", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:1"]()]]
Caused by op 'tower_1/CTCLoss', defined at:
File "DeepSpeech.py", line 1136, in
last_train_wer, last_dev_wer, hibernation_path = train()
File "DeepSpeech.py", line 1040, in train
train_context = create_execution_context('train')
File "DeepSpeech.py", line 790, in create_execution_context
tower_results = get_tower_results(data_set, optimizer=optimizer)
File "DeepSpeech.py", line 469, in get_tower_results
calculate_accuracy_and_loss(batch_set, no_dropout if optimizer is None else dropout_rates)
File "DeepSpeech.py", line 351, in calculate_accuracy_and_loss
total_loss = ctc_ops.ctc_loss(labels=batch_y, inputs=logits, sequence_length=batch_seq_len)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 145, in ctc_loss
ctc_merge_repeated=ctc_merge_repeated)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 164, in _ctc_loss
name=name)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
self._traceback = _extract_stack()
I've never seen this problem before. It appears
...
File "/data1/tangzy/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
...
as if you are running Python 3.5. To my knowledge, we have never tested on Python 3.x.
Is it possible for you to test on Python 2.7 and see if the problem persists?
@Tangzy7 We've updated our code for python 3 now. Could you recheck with master? Thanks!
@Tangzy7 I'm going to close this as inactive.
Has anyone found an answer?
@danFromTelAviv the Unicode support added some time ago should make this essentially impossible. You should get a KeyError in text.py before you ever get an error in the CTC loss function. Are you by any chance using a (very) old version of the code?
I resolved the issue (I feel a little dumb in retrospect). You must specify that the number of classes is one more than the number of characters you have, because CTC needs one extra class for the blank label.
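For anyone landing here later, a minimal sketch of that fix. TensorFlow's ctc_loss reserves the largest class index (num_classes - 1) for the blank label, so character labels have to stay in the range 0 to num_classes - 2. The alphabet and the helper names below (ALPHABET, ctc_num_classes, encode, check_labels) are made up for illustration and are not taken from DeepSpeech.py; the alphabet is only a guess that happens to match the labels in the trace above (0 = space, 1-26 = a-z).

```python
import string

# Hypothetical alphabet for illustration: space, a-z, apostrophe (28 characters).
ALPHABET = " " + string.ascii_lowercase + "'"


def ctc_num_classes(alphabet):
    """One output class per character, plus one extra class for the CTC blank."""
    return len(alphabet) + 1


def encode(text, alphabet):
    """Map each character to its index in the alphabet (ValueError if unknown)."""
    return [alphabet.index(ch) for ch in text]


def check_labels(labels, num_classes):
    """Raise ValueError if a label collides with, or exceeds, the blank index."""
    blank = num_classes - 1  # the last index is reserved for the blank label
    for position, label in enumerate(labels):
        if not 0 <= label < blank:
            raise ValueError(
                "label %d at position %d is outside [0, %d]"
                % (label, position, blank - 1)
            )


num_classes = ctc_num_classes(ALPHABET)   # 28 characters + 1 blank
labels = encode("i am", ALPHABET)
check_labels(labels, num_classes)         # passes: every label is below the blank index
```

Running a check like this on the transcripts before they reach the loss gives a readable error per label instead of a failure inside the CTCLoss op. If num_classes were set to len(ALPHABET) by mistake, the last character of the alphabet would get the same index as the blank and check_labels would reject it.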