Attention OCR could not converge; it is a bad project. The loss gradually increases. The dataset has been double-checked and has 8202 classes.
2018-01-22 17:41:30.359779: I tensorflow/core/kernels/logging_ops.cc:79] lbls[5021][2178][2010][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][127][657][6862][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:30.460314: I tensorflow/core/kernels/logging_ops.cc:79] lbls[209][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][7355][3947][4114][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:31.147930: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:31.148958: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:31.149403: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:31.149848: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:31.150309: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:31.150345: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:31.150855: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:31.150888: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:31.219392: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:31.219826: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:31.221575: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][][5554 5554 5554...][16][16][8 16]
2018-01-22 17:41:31.222405: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[7382][7382][5554][5554][5554][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][5554][][5554 5554 5554...][16][16][8 16]
2018-01-22 17:41:31.231981: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[31.543655]
2018-01-22 17:41:31.232015: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[34.622742]
INFO:tensorflow:global step 1: loss = 34.7808 (4.958 sec/step)
INFO 2018-01-22 17:41:31.000788: tf_logging.py: 82 global step 1: loss = 34.7808 (4.958 sec/step)
2018-01-22 17:41:31.844581: I tensorflow/core/kernels/logging_ops.cc:79] lbls[5][2038][4389][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][6802][2178][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:31.858722: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:31.860548: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:31.861751: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:31.861780: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:31.896185: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:31.897872: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][4200][][5554][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][1736][][5554 1736 1736...][16][16][8 16]
2018-01-22 17:41:31.902992: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[29.514463]
INFO:tensorflow:global step 2: loss = 29.6725 (0.226 sec/step)
INFO 2018-01-22 17:41:32.000061: tf_logging.py: 82 global step 2: loss = 29.6725 (0.226 sec/step)
2018-01-22 17:41:32.077037: I tensorflow/core/kernels/logging_ops.cc:79] lbls[2864][2648][5646][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][6376][2121][1203][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:32.089354: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:32.090356: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:32.091563: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:32.091597: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:32.127391: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:32.129693: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][][1736][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][5238][][1736 5238 5238...][16][16][8 16]
2018-01-22 17:41:32.135553: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[36.587006]
INFO:tensorflow:global step 3: loss = 36.7450 (0.164 sec/step)
INFO 2018-01-22 17:41:32.000228: tf_logging.py: 82 global step 3: loss = 36.7450 (0.164 sec/step)
2018-01-22 17:41:32.244753: I tensorflow/core/kernels/logging_ops.cc:79] lbls[2430][2331][2602][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][3750][6375][7678][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:32.256183: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:32.257116: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:32.258149: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:32.258288: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:32.294241: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:32.296386: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[4200][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][7287][4200][4200][4200][4200][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][7287 4200 4200...][16][16][8 16]
2018-01-22 17:41:32.301545: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[32.544807]
INFO:tensorflow:global step 4: loss = 32.7028 (0.171 sec/step)
INFO 2018-01-22 17:41:32.000402: tf_logging.py: 82 global step 4: loss = 32.7028 (0.171 sec/step)
2018-01-22 17:41:32.415413: I tensorflow/core/kernels/logging_ops.cc:79] lbls[5608][6993][5888][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][2509][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:41:32.427194: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:41:32.428143: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:41:32.429181: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:41:32.429210: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:41:32.465629: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:41:32.468201: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[5238][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][5238][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][2443][][5238 2443 2443...][16][16][8 16]
2018-01-22 17:41:32.473847: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[33.510151]
INFO:tensorflow:global step 5: loss = 33.6682 (0.165 sec/step)
INFO 2018-01-22 17:41:32.000570: tf_logging.py: 82 global step 5: loss = 33.6682 (0.165 sec/step)
....
INFO 2018-01-22 17:45:08.000270: tf_logging.py: 82 global step 1292: loss = 181.0154 (0.156 sec/step)
2018-01-22 17:45:08.281153: I tensorflow/core/kernels/logging_ops.cc:79] lbls[1928][2675][6477][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][4963][269][6482][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:45:08.293001: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:45:08.294005: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:45:08.295178: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:45:08.295209: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:45:08.331004: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:45:08.332727: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][2633][][2633 2633 2633...][16][16][8 16]
2018-01-22 17:45:08.337821: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[255.01726]
INFO:tensorflow:global step 1293: loss = 255.2530 (0.158 sec/step)
INFO 2018-01-22 17:45:08.000429: tf_logging.py: 82 global step 1293: loss = 255.2530 (0.158 sec/step)
2018-01-22 17:45:08.441829: I tensorflow/core/kernels/logging_ops.cc:79] lbls[2921][540][6477][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][][1923][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][8202][]
2018-01-22 17:45:08.451025: I tensorflow/core/kernels/logging_ops.cc:79] Mixed_5d shape[8][5][72][288]
2018-01-22 17:45:08.451860: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn concat shape[8 5 72...][8 5 72...]
2018-01-22 17:45:08.452752: I tensorflow/core/kernels/logging_ops.cc:79] pool views fn out shape[8 72 1440]
2018-01-22 17:45:08.452780: I tensorflow/core/kernels/logging_ops.cc:79] pooled view shape[8 72 1440][8 5 72...]
2018-01-22 17:45:08.489094: I tensorflow/core/kernels/logging_ops.cc:79] logits shape[8 16 7917]
2018-01-22 17:45:08.490680: I tensorflow/core/kernels/logging_ops.cc:79] chars_logit[2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][2105][][2105][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][5888][][2105 5888 5888...][16][16][8 16]
2018-01-22 17:45:08.495944: I tensorflow/core/kernels/logging_ops.cc:79] sequence loss[210.91998]
@chengmengli06 Please focus the conversation on technical points, which we're happy to discuss. Calling the project "bad" and the code "stupid" isn't helping to make your point, and just seems like trolling.
I'm assuming you're referring to:
https://github.com/tensorflow/models/tree/master/research/attention_ocr
Do you have any further technical insight into your problem? So far all you've said is that you used custom data (in some unspecified fashion), and your loss is increasing.
I'm closing this out, but if you do have further technical points, just post them and I'll be happy to re-open.
CC @alexgorban
Sorry about my rude comments. I still have no idea: the loss starts at around 30 and then increases, while for the FSNS dataset the loss starts at >100 and then decreases. I will check the code again later. For now I have switched to the implementation at https://github.com/da03/Attention-OCR. Are there any points I might have missed when switching to my own dataset? I double-checked my data and everything seems to be good. My data has only one view. I paste my modifications and my dataset scripts below. My char dict includes 8022 Chinese chars. @tatatodd
[code_change.txt](https://github.com/tensorflow/models/files/1673385/code_change.txt)
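For anyone trying the same thing, here is a rough sketch of the kind of custom dataset definition involved (it follows the structure of `datasets/fsns.py` in the repo; the module name, split sizes, patterns and image shape below are placeholders, and the sequence length and null code are only guesses based on the logs above, not values taken from code_change.txt):

```python
# Sketch of a custom dataset module in the style of datasets/fsns.py.
# All concrete values here are illustrative placeholders.
import os

from datasets import fsns  # reuse the stock FSNS reader for custom tfrecords

DEFAULT_DATASET_DIR = os.path.join('data', 'my_chinese_text')  # placeholder path

DEFAULT_CONFIG = {
    'name': 'MyChineseText',
    'splits': {
        'train': {'size': 100000, 'pattern': 'train*'},  # placeholder sizes/patterns
        'test': {'size': 10000, 'pattern': 'test*'},
    },
    'charset_filename': 'dic.txt',   # the custom character dictionary
    'image_shape': (40, 576, 3),     # placeholder; must match the encoded images
    'num_of_views': 1,               # single view, as described above
    'max_sequence_length': 16,       # guessed from the 16-step labels in the logs
    'null_code': 8202,               # guessed from the [8202] padding id in the logs
    'items_to_descriptions': {
        'image': 'A color image.',
        'label': 'Character codes.',
        'text': 'A unicode string.',
        'length': 'Length of the encoded text.',
        'num_of_views': 'Number of views stored within the image.',
    },
}


def get_split(split_name, dataset_dir=None, config=None):
    # Delegate to the FSNS reader, which parses the standard tf.Example fields.
    if not dataset_dir:
        dataset_dir = DEFAULT_DATASET_DIR
    if not config:
        config = DEFAULT_CONFIG
    return fsns.get_split(split_name, dataset_dir, config)
```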
@alexgorban do you have any ideas here?
I have the same result: the loss keeps increasing to 30000+ ... I double-checked my dataset and found no mistakes.
I ran into the same problem. After trying both my own charset and the provided 134-character charset, the problem still exists: the loss keeps increasing and inference gives junk output. Did you finally solve this problem? @chengmengli06 @tatatodd @zdnet
Also, I tried my own dataset with the model from https://github.com/da03/Attention-OCR and the results were correct, giving me 87% accuracy.
I'm sorry to hear that you're experiencing issues with model convergence on your data, but I don't see how I can help without additional information (code, data).
Please run all tests and report the results:
cd research/attention_ocr/python
python -m unittest discover -p '*_test.py'
If all tests are green, please provide links to your fork and to the data (at least some samples).
I think I may have fixed this issue.
My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
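For example, with this fix the first few lines of the charset file would look something like the sketch below (each line is an id, a tab, then the character; the characters after index 0 are just placeholders):

```
0	 
1	a
2	b
3	c
```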
morningdip's solution fixed my custom dataset training. Thanks @morningdip, but can you tell me how you found this out?
@morningdip Thanks for a wonderful solution. Can you please explain the considerations for creating the charset file? I used ASCII codes to map letters. Is this correct? What is the correct way to assign IDs to characters?
Also, can you share how to create the charset file using Python? I am using the following code, but something seems wrong with the encoding; I am unable to create UTF-8 without BOM:
with open("charset.txt", 'w', encoding="utf-8") as f:
f.write("\n".join(content_lines))
Closing this issue since it's resolved. Feel free to reopen if you have any follow-up questions. Thanks!
> I think I may have fixed this issue.
> My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
> Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
Even after adding the space at the start, the loss still increases. I am training on 2000 images with a batch size of 16.
Even for me, when I train attention OCR on my custom data of 8000 images, my loss increases. How does one solve this problem?
> Even for me, when I train attention OCR on my custom data of 8000 images, my loss increases. How does one solve this problem?

Can you share more details, like the input image size, etc.?
My input data consists of 8000 cropped number-plate images. I have my own custom data with a single view only, and my image size is (200, 200). My batch size is 16, and I have tried changing the learning rate to help the model converge, but that didn't help much.
Any guidance will be appreciated.
> I think I may have fixed this issue.
> My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
> Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
How did you create your record file while keeping index 0 as ' '? Or did you manually add index 0 as ' ' after creating the record file?
> I think I may have fixed this issue.
> My solution is to add an index 0 which represents ' ' (a space after the \t) as the first line of my dic.txt.
> Don't forget to check the file encoding; it must be 'UTF-8 without BOM'.
>
> How did you create your record file while keeping index 0 as ' '? Or did you manually add index 0 as ' ' after creating the record file?
Hey, in the charset label or your dict.txt file, you manually add the index 0 as ' '.