Attention OCR could not converge; it is a bad project. The loss gradually increases. The dataset has been double-checked and has 8202 classes.
2018-01-22 17:41:30.359779: I tensorflow/core/kernels/logging_ops.cc:79] lbls[5021][2178][2010][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][127][657][6862][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:30.460314: I tensorflow/core/kernels/logging_ops.cc:79] lbls[209][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][7355][3947][4114][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:31.147930: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:31.148958: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:31.149403: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:31.149848: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:31.150309: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:31.150345: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:31.150855: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:31.150888: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:31.219392: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:31.219826: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:31.221575: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][][5554 5554 5554...][16][16][8 16]
2018-01-22 17:41:31.222405: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[7382][7382][5554][5554][5554][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][][5554 5554 5554...][16][16][8 16]
2018-01-22 17:41:31.231981: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[31.543655]
2018-01-22 17:41:31.232015: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[34.622742]
INFO:tensorflow:global step 1: loss = 34.7808 (4.958 sec/step)
INFO 2018-01-22 17:41:31.000788: tf_logging.py: 82 global step 1: loss = 34.7808 (4.958 sec/step)
2018-01-22 17:41:31.844581: I tensorflow/core/kernels/logging_ops.cc:79] lbls[5][2038][4389][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][6802][2178][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:31.858722: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:31.860548: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:31.861751: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:31.861780: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:31.896185: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:31.897872: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][][5554][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][][5554 1736 1736...][16][16][8 16]
2018-01-22 17:41:31.902992: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[29.514463]
INFO:tensorflow:global step 2: loss = 29.6725 (0.226 sec/step)
INFO 2018-01-22 17:41:32.000061: tf_logging.py: 82 global step 2: loss = 29.6725 (0.226 sec/step)
2018-01-22 17:41:32.077037: I tensorflow/core/kernels/logging_ops.cc:79] lbls[2864][2648][5646][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][6376][2121][1203][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:32.089354: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:32.090356: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:32.091563: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:32.091597: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:32.127391: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:32.129693: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][][1736][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][][1736 5238 5238...][16][16][8 16]
2018-01-22 17:41:32.135553: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[36.587006]
INFO:tensorflow:global step 3: loss = 36.7450 (0.164 sec/step)
INFO 2018-01-22 17:41:32.000228: tf_logging.py: 82 global step 3: loss = 36.7450 (0.164 sec/step)
2018-01-22 17:41:32.244753: I tensorflow/core/kernels/logging_ops.cc:79] lbls[2430][2331][2602][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][3750][6375][7678][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:32.256183: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:32.257116: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:32.258149: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:32.258288: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:32.294241: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:32.296386: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[4200][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][7287][4200][4200][4200][4200][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][7287 4200 4200...][16][16][8 16]
2018-01-22 17:41:32.301545: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[32.544807]
INFO:tensorflow:global step 4: loss = 32.7028 (0.171 sec/step)
INFO 2018-01-22 17:41:32.000402: tf_logging.py: 82 global step 4: loss = 32.7028 (0.171 sec/step)
2018-01-22 17:41:32.415413: I tensorflow/core/kernels/logging_ops.cc:79] lbls[5608][6993][5888][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][2509][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:32.427194: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:32.428143: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:32.429181: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:32.429210: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:32.465629: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:32.468201: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[5238][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][5238][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][5238 2443 2443...][16][16][8 16]
2018-01-22 17:41:32.473847: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[33.510151]
INFO:tensorflow:global step 5: loss = 33.6682 (0.165 sec/step)
INFO 2018-01-22 17:41:32.000570: tf_logging.py: 82 global step 5: loss = 33.6682 (0.165 sec/step)
....
INFO 2018-01-22 17:45:08.000270: tf_logging.py: 82 global step 1292: loss = 181.0154 (0.156 sec/step)
2018-01-22 17:45:08.281153: I tensorflow/core/kernels/logging_ops.cc:79] lbls[1928][2675][6477][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][4963][269][6482][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:45:08.293001: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:45:08.294005: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:45:08.295178: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:45:08.295209: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:45:08.331004: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:45:08.332727: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][][2633 2633 2633...][16][16][8 16]
2018-01-22 17:45:08.337821: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[255.01726]
INFO:tensorflow:global step 1293: loss = 255.2530 (0.158 sec/step)
INFO 2018-01-22 17:45:08.000429: tf_logging.py: 82 global step 1293: loss = 255.2530 (0.158 sec/step)
2018-01-22 17:45:08.441829: I tensorflow/core/kernels/logging_ops.cc:79] lbls[2921][540][6477][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][1923][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:45:08.451025: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:45:08.451860: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:45:08.452752: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:45:08.452780: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:45:08.489094: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:45:08.490680: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][][2105][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][][2105 5888 5888...][16][16][8 16]
2018-01-22 17:45:08.495944: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[210.91998]
@chengmengli06 Please focus the conversation on technical points, which we're happy to discuss. Calling the project "bad" and the code "stupid" isn't helping to make your point, and just seems like trolling.
I'm assuming you're referring to:
https://github.com/tensorflow/models/tree/master/research/attention_ocr
Do you have any further technical insight into your problem? So far all you've said is that you used custom data (in some unspecified fashion), and your loss is increasing.
I'm closing this out, but if you do have further technical points, just post them and I'll be happy to re-open.
CC @alexgorban
Sorry about my rude comments. I still have no idea: the loss starts at around 30 and then increases, while for the FSNS dataset the loss starts at >100 and then decreases. I will check the code again later. For now I have switched to the implementation at https://github.com/da03/Attention-OCR. Are there any points I might have missed when switching to my own dataset? I double-checked my data and everything seems to be good. My data has only one view. I paste my modifications and my dataset scripts below. My char dict includes 8022 Chinese chars. @tatatodd
[code_change.txt](https://github.com/tensorflow/models/files/1673385/code_change.txt)
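For anyone trying the same thing, here is a rough sketch of the kind of custom dataset definition involved (it follows the structure of `datasets/fsns.py` in the repo; the module name, split sizes, patterns and image shape below are placeholders, and the sequence length and null code are only guesses based on the logs above, not values taken from code_change.txt):

```python
# Sketch of a custom dataset module in the style of datasets/fsns.py.
# All concrete values here are illustrative placeholders.
import os

from datasets import fsns  # reuse the stock FSNS reader for custom tfrecords

DEFAULT_DATASET_DIR = os.path.join('data', 'my_chinese_text')  # placeholder path

DEFAULT_CONFIG = {
    'name': 'MyChineseText',
    'splits': {
        'train': {'size': 100000, 'pattern': 'train*'},  # placeholder sizes/patterns
        'test': {'size': 10000, 'pattern': 'test*'},
    },
    'charset_filename': 'dic.txt',   # the custom character dictionary
    'image_shape': (40, 576, 3),     # placeholder; must match the encoded images
    'num_of_views': 1,               # single view, as described above
    'max_sequence_length': 16,       # guessed from the 16-step labels in the logs
    'null_code': 8202,               # guessed from the [8202] padding id in the logs
    'items_to_descriptions': {
        'image': 'A color image.',
        'label': 'Character codes.',
        'text': 'A unicode string.',
        'length': 'Length of the encoded text.',
        'num_of_views': 'Number of views stored within the image.',
    },
}


def get_split(split_name, dataset_dir=None, config=None):
    # Delegate to the FSNS reader, which parses the standard tf.Example fields.
    if not dataset_dir:
        dataset_dir = DEFAULT_DATASET_DIR
    if not config:
        config = DEFAULT_CONFIG
    return fsns.get_split(split_name, dataset_dir, config)
```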
@alexgorban do you have any ideas here?
I have the same result: the loss keeps increasing to 30000+ ... I double-checked my dataset and found no mistakes.
I ran into the same problem. After trying both my own charset and the provided 134-character charset, the problem still exists: the loss keeps increasing and inference gives junk output. Did you finally solve this problem? @chengmengli06 @tatatodd @zdnet
Also, I tried my own dataset with the model from https://github.com/da03/Attention-OCR and the results were correct, giving me 87% accuracy.
I'm sorry to hear that you're experiencing issues with model convergence on your data, but I don't see how I can help without additional information (code, data).
Please run all tests and report the results:
cd research/attention_ocr/python
python -m unittest discover -p '*_test.py'
If all tests are green, please provide links to your fork and to the data (at least some samples).
I think I may have fixed this issue.
My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
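For example, with this fix the first few lines of the charset file would look something like the sketch below (each line is an id, a tab, then the character; the characters after index 0 are just placeholders):

```
0	 
1	a
2	b
3	c
```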
morningdip's solution fixed my custom dataset training. Thanks @morningdip, but can you tell me how you found this out?
@morningdip Thanks for a wonderful solution. Can you please explain the considerations for creating the charset file? I used ASCII codes to map letters. Is this correct? What is the correct way to assign IDs to characters?
Also, can you share how to create the charset file using Python? I am using the following code, but something seems wrong with the encoding; I am unable to create UTF-8 without BOM:
with open("charset.txt", 'w', encoding="utf-8") as f:
f.write("\n".join(content_lines))
Closing this issue since it's resolved. Feel free to reopen if you have any follow-up questions. Thanks!
> I think I may have fixed this issue.
> My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
> Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
Even after adding the space at the start, the loss still increases. I am training on 2000 images with a batch size of 16.
Even for me, when I train attention OCR on my custom data of 8000 images, my loss increases. How does one solve this problem?
> Even for me, when I train attention OCR on my custom data of 8000 images, my loss increases. How does one solve this problem?

Can you share more details, like the input image size, etc.?
My input data consists of 8000 cropped number-plate images. I have my own custom data with a single view only, and my image size is (200, 200). My batch size is 16, and I have tried changing the learning rate to help the model converge, but that didn't help much.
Any guidance will be appreciated.
> I think I may have fixed this issue.
> My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
> Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
How did you create your record file while keeping index 0 as ' '? Or did you manually add index 0 as ' ' after creating the record file?
> I think I may have fixed this issue.
> My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
> Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
>
> How did you create your record file while keeping index 0 as ' '? Or did you manually add index 0 as ' ' after creating the record file?
Hey, in the charset label or your dict.txt file, you manually add the index 0 as ' '.