Models: Corrupt JPEG data: 245 extraneous bytes before marker 0xd9

Created on 11 Aug 2017 · 40 Comments · Source: tensorflow/models

System information

What is the top-level directory of the model you are using: tensorflow/models
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 x64
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.2.1
Bazel version (if compiling from source):
CUDA/cuDNN version: 8.0 / 5.1
GPU model and memory: GeForce GTX 1060 6GB
Exact command to reproduce: python object_detection\eval.py --logtostderr --checkpoint_dir=C:Users\TVS\tensorflow\models\out\ --eval_dir=C:Users\TVS\tensorflow\models\eval\ --pipeline_config_path=C:Users\TVS\tensorflow\models\object_detection\samples\configs\faster_rcnn_resnet101_pets.config

Describe the problem

When training on the Oxford-IIIT Pet Dataset from scratch (without a fine-tune checkpoint from COCO), I get the following error message while running the eval.py script:

Corrupt JPEG data: 245 extraneous bytes before marker 0xd9

I built the TFRecord files with the script provided in the Object Detection API, but I get the same error with my own PASCAL VOC-style dataset. I saw some people having similar problems related to libjpeg; could this error be related to libjpeg?

I am sure that my TFRecords are valid, because I can reconstruct valid image and annotation data from them.

Since I did not write any code of my own, I suspect this is a bug.

research awaiting model gardener support

RobinBaumann

👍9

Most helpful comment

I had the same error with the pet dataset, but it is now working. I first went through the images folder and found that the following files had the .jpg extension but were actually .png files:

egyptian_mau_14, 139, 145, 156, 167, 177, 186, 191; abyssinian_5, 34.

I still had the error after deleting these files and .xml data. After running "jpeginfo" in the images folder I found that chihuahua_121 and beagle_116 were also corrupted, even though they displayed correctly. I deleted these files as well and then rebuilt the tfrecords as explained in the tutorial. The corrupt jpeg error does not display now and everything seems to be working.
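For readers without jpeginfo, a rough structural check can be sketched in plain Python. This is only an approximation under stated assumptions: it tests that a file starts with the JPEG start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9). That flags mislabeled and truncated files, but it will not catch every kind of corruption that jpeginfo or libjpeg reports. The function names `looks_like_complete_jpeg` and `find_suspect_jpegs` are hypothetical.

```python
import os

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def looks_like_complete_jpeg(data):
    """Coarse check: the bytes start with SOI and end with EOI.

    Files with padding after EOI are flagged too, so treat a hit as
    "inspect this file", not as proof of corruption.
    """
    return len(data) >= 4 and data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI)


def find_suspect_jpegs(directory):
    """Return sorted names of .jpg/.jpeg files that fail the coarse check."""
    suspects = []
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith((".jpg", ".jpeg")):
            continue
        with open(os.path.join(directory, name), "rb") as handle:
            if not looks_like_complete_jpeg(handle.read()):
                suspects.append(name)
    return suspects
```

Calling `find_suspect_jpegs('./images')` before building the TFRecords gives a list of files worth inspecting or re-encoding.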

gpeier on 18 Nov 2017

👍12

All 40 comments

/CC @tombstone

reedwm on 14 Aug 2017

I got this error too. I also tested all the images and they were good. Is something fishy in the JPEG encoder/decoder?

Mistobaan on 15 Aug 2017

I got exactly the same error on Ubuntu 14.04. Have you solved it? @RobinBaumann @Mistobaan

lancejchen on 28 Sep 2017

I'm guessing you all used the generate_tf_record_pet Python script to create the record files?
If so, I believe this error is caused by the value set for image/format in
'image/format': dataset_util.bytes_feature

Originally it is hardcoded to JPEG. However, I also had PNGs, so it failed when running on the cloud. I switched it to determine the image format dynamically:

    byte_flag = 'jpeg'
    if '.png' in os.path.basename(img_path):
        byte_flag = 'png'
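Because several files in this thread turned out to be PNGs carrying a .jpg extension, a check based on the file name can still pick the wrong format. A minimal sketch of deciding from the file contents instead, assuming only JPEG and PNG need to be told apart (the helper name `detect_image_format` is hypothetical):

```python
def detect_image_format(encoded_image):
    """Return 'jpeg' or 'png' based on the leading magic bytes, else None."""
    if encoded_image.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if encoded_image.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    return None  # unknown format: better to skip or convert the file


# Example with in-memory bytes standing in for file contents.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
print(detect_image_format(fake_png))
```

The returned string could then be used in place of the hardcoded 'jpeg' value when filling the image/format feature, and files that return None can be logged and left out of the record.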

seahawks8 on 1 Oct 2017

I received this message when I used a Logitech 270, but it doesn't happen with other webcams or with the default webcam.

robemorin on 9 Oct 2017

👍1

Make sure that when you extract the images from the camera, you set image/format correctly in the TFRecord.

seahawks8 on 9 Oct 2017

In the pet data set there are no PNG files, only JPEG files. It seems there is still no valid solution in the discussion so far. Does anyone have a working solution?

ybsave on 30 Oct 2017

@ybsave the solution would be to check the image/format dynamically, instead of having byte_flag = 'jpeg' always set to JPEG.

seahawks8 on 30 Oct 2017

But if we are referring only to the pet dataset, then there should be no issues.
People only experience issues when trying to use other datasets.

seahawks8 on 30 Oct 2017

@ybsave are you having the issue using only the pet data set and its annotations? Or did you add or modify any annotations or images?

seahawks8 on 30 Oct 2017

I have exactly the same issue when working on the pet data set. I have not worked on other data sets yet. I use the official demo code for TFRecord generation, training, and testing, with no changes to any part.

It shows: "Corrupt JPEG data: 245 extraneous bytes before marker 0xd9"

After a while, it shows: "RuntimeWarning: invalid value encountered in true_divide
num_images_correctly_detected_per_class / num_gt_imgs_per_class)"

ybsave on 30 Oct 2017

Removed the above files @gpeier mentioned. Still getting:

Corrupt JPEG data: 240 extraneous bytes before marker 0xd9

tansut on 18 Nov 2017

I also removed the .xml files and the corresponding entries in list.txt, test.txt, and trainval.txt. I am very new to tensorboard, so I really can't tell whether those were necessary steps. Also, consider installing jpeginfo, running it in the images folder, and checking whether you missed any corrupt JPEGs.
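Editing list.txt, test.txt, and trainval.txt by hand is easy to get wrong, so the clean-up can be sketched as a small filter. This assumes each entry line begins with the image name (without extension) followed by whitespace-separated fields, and that lines starting with '#' are comments. The helper name `drop_removed_entries` and the sample field values are made up for illustration.

```python
def drop_removed_entries(lines, removed_names):
    """Return the lines whose first token is not one of the removed image names."""
    kept = []
    for line in lines:
        tokens = line.split()
        # Blank lines and '#' comment lines are passed through unchanged.
        if not tokens or tokens[0].startswith("#"):
            kept.append(line)
        elif tokens[0] not in removed_names:
            kept.append(line)
    return kept


# Illustrative input; the numeric fields are placeholders, not real annotations.
sample = [
    "# image class species breed",
    "beagle_116 1 1 1",
    "beagle_117 1 1 1",
    "chihuahua_121 1 1 1",
]
removed = {"beagle_116", "chihuahua_121"}
for line in drop_removed_entries(sample, removed):
    print(line)
```

Reading each list file, passing its lines through the filter, and writing the result back keeps the three files consistent with the images that remain on disk.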

gpeier on 18 Nov 2017

I would recommend putting a try/except statement around the specific line, or using whatever other means, to narrow down which file is causing the issue. It could be caused by JPG instead of JPEG or PNG. How did you acquire the images you are trying to train on? Many people who download images from Google via some script end up with corrupted or bad images with wrong extensions.
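The try/except idea can be sketched as a loop that decodes every file and records which ones fail. Here `decode` is a stand-in for whatever loader you actually use (for example a function that opens and fully loads the image with PIL), and `find_failing_files` is a hypothetical helper name. Note that libjpeg messages such as "Corrupt JPEG data" are warnings and may not raise an exception, so this only finds files that fail outright.

```python
def find_failing_files(paths, decode):
    """Call decode(path) for each path and collect (path, error) for failures."""
    failures = []
    for path in paths:
        try:
            decode(path)
        except Exception as exc:  # deliberately broad: we only want the culprit's name
            failures.append((path, repr(exc)))
    return failures


# Demonstration with a fake decoder that rejects one file name.
def fake_decode(path):
    if path.endswith("broken.jpg"):
        raise ValueError("cannot decode " + path)


for path, error in find_failing_files(["a.jpg", "broken.jpg", "c.jpg"], fake_decode):
    print(path, error)
```

Swapping `fake_decode` for a real decoder turns this into a quick way to list the offending files before rebuilding the records.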

seahawks8 on 18 Nov 2017

Images are from the oxford/pets website as described in the tutorial. I suspect that either some files in the archive were already corrupted or were corrupted during download. Definitely not a fast download considering I only got 500kb/s on a 1000mb/s google fiber connection.

gpeier on 19 Nov 2017

I confirmed that there is no download problem either. The problem seems to lie in the pet data set itself.

ybsave on 20 Nov 2017

I got this error when running eval.py on pet_val.record generated by create_pet_tf_record.py

2017-12-07 15:07:10.069475: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-07 15:07:10.069513: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-12-07 15:07:10.069523: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-07 15:07:10.069531: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
INFO:tensorflow:Restoring parameters from /Users/visethsean/Downloads/models-master/research/object_detection/models/model/train/model.ckpt-0
INFO:tensorflow:Restoring parameters from /Users/visethsean/Downloads/models-master/research/object_detection/models/model/train/model.ckpt-0
Corrupt JPEG data: 240 extraneous bytes before marker 0xd9

Has anyone already solved the problem?

visethchapman on 8 Dec 2017

I am able to reproduce this problem. I can't figure it out and am waiting for an answer.

rzhai on 6 Feb 2018

😕1

During training, I noticed "Corrupt JPEG data: premature end of data segment" in the middle of the training steps.

rzhai on 8 Feb 2018

Use the following script to fix the image format and corruption problems before running create_pet_tf_record.py; then everything will be OK.

import os
import sys

import tensorflow as tf
import PIL.Image  # "import PIL" alone does not guarantee that PIL.Image is loaded


def reencode_as_jpeg(pathname_jpg, pathname_backup):
    # Keep a copy of the original bytes under their real extension,
    # then overwrite the .jpg file with a freshly encoded JPEG.
    tf.gfile.Copy(pathname_jpg, pathname_backup, True)
    PIL.Image.open(pathname_backup).convert('RGB').save(pathname_jpg, 'jpeg')


def main(argv):
    path_images = './images'
    for filename_src in tf.gfile.ListDirectory(path_images):
        stem, extension = os.path.splitext(filename_src)
        if extension.lower() != '.jpg':
            continue

        pathname_jpg = '{}/{}'.format(path_images, filename_src)
        with tf.gfile.GFile(pathname_jpg, 'rb') as fid:
            header = fid.read(4)  # the magic bytes identify the real format

        if header.startswith(b'\x89PNG'):
            # PNG stored with a .jpg extension: copy jpg->png, then encode png->jpg
            print('png:{}'.format(filename_src))
            reencode_as_jpeg(pathname_jpg, '{}/{}.png'.format(path_images, stem))
        elif header.startswith(b'GIF'):
            # GIF stored with a .jpg extension: copy jpg->gif, then encode gif->jpg
            print('gif:{}'.format(filename_src))
            reencode_as_jpeg(pathname_jpg, '{}/{}.gif'.format(path_images, stem))
        elif filename_src in ('beagle_116.jpg', 'chihuahua_121.jpg'):
            # JPEGs reported as corrupt: copy jpg->jpeg, then re-encode jpeg->jpg
            print('jpeg:{}'.format(filename_src))
            reencode_as_jpeg(pathname_jpg, '{}/{}.jpeg'.format(path_images, stem))
        elif not header.startswith(b'\xff\xd8\xff'):
            # Neither PNG, GIF, nor JPEG: report it and leave it alone
            print('not jpg:{}'.format(filename_src))


if __name__ == "__main__":
    sys.exit(int(main(sys.argv) or 0))

junfengchen2016 on 3 Jul 2018

👍6 😄1

I have also run into this strange problem. The images can be read with cv2.imread on computer A, but when they are copied to computer B, which has a similar system and environment, the warning occurs. I first identified the images that trigger the warning and then simply re-saved them on B, after which the problem was gone.

    im = cv2.imread(fname)
    cv2.imwrite(fname, im)

Remember2018 on 7 Oct 2018

I'm getting a similar error during training on Linux, but when I use Windows everything works fine. I tried several times. I checked for corrupted images by loading the images one by one, and couldn't find any.

Tensorflow 1.12.0, keras==2.2.4 cuda:9.0-cudnn7-devel-ubuntu16.04

6698/12369 [===============>..........Corrupt JPEG data: 1 extraneous bytes before marker 0xd9
6789/1236Corrupt JPEG data: 1 extraneous bytes before marker 0xd9
6155/12369 [=============>................] -Corrupt JPEG data: 1 extraneous bytes before marker 0xd9
3439/12369 [=======>......................] - ETA: 23:55 - loCorrupt JPEG data: 1 extraneous bytes before marker 0xd9

Does anyone know what the problem is?

Aravinda89 on 22 Nov 2018

@Aravinda89 that is really odd. How did you verify the images one by one? I suggest running a loop using the following. My first guess is that one of the images was corrupted during the file transfer. Anyway, give the code below a shot on the Linux box.

from PIL import Image
img = Image.open(filename)
img.verify()

seahawks8 on 22 Nov 2018

@seahawks8 Thanks for replying. I already tried that. After I installed TensorFlow 1.10 instead of 1.11 or 1.12, the error disappeared. So could this error have something to do with TensorFlow?

Aravinda89 on 26 Nov 2018

It definitely could. I have had my fair share of battles with Tensorflow this past weekend actually.

Perhaps a different approach would be to run a tool such as Mogrify to check for those extra bytes on Linux. I found a few examples (links below) of how to do this; please give it a shot and let me know if it works.
https://www.imagemagick.org/discourse-server/viewtopic.php?t=23971
https://stackoverflow.com/questions/24805500/can-i-fix-photos-with-corrupt-jpeg-data

seahawks8 on 26 Nov 2018

@seahawks8 Thanks. I'm using TensorFlow 1.10 right now. It trains fine for some epochs (sometimes 15-20) without the error, then suddenly stops when the error shows up. Still, it feels better than TF 1.11 or 1.12, where I couldn't even complete one epoch. How about you? Any progress? Did you find the exact reason for the error? It may be related to TensorFlow.

Aravinda89 on 28 Nov 2018

Yeah, I had a few errors. Recently I was training a Mask R-CNN COCO model, and the first error I hit was a bounding box larger than its image; that was a data error, which I fixed. I was using runtime version 1.10 on Google Cloud with Python 3.5 and kept getting issues. As soon as I switched to Python 2.7 (the default) on Google Cloud, it was fine.

seahawks8 on 28 Nov 2018

Same errors; is there a solution?

NorwayLobster on 2 Dec 2018

After trying all the solutions above, my images were still not working. I read all the training images with PIL's Image, removed every image that produced a warning, and re-saved the others to their original paths with quality 90. That seems to have solved the problem in my case. My guess is that saving with PIL removes the redundant bytes.

crintoYL on 26 Dec 2018

👍1

I have the same problem with live stream video from Logitech 270

hongsamvo on 24 May 2019

I have the same problem when using ssd_mobilenet_v1_coco_2018_01_28, but when I tried ssd_mobilenet_v1_coco_11_06_2017 there were no errors at all.

Shaul-Z on 24 Jul 2019

@Shaul-Z
Are you still stuck on this issue?

Tomlin0110 on 12 Sep 2019

Same issue as this: https://github.com/opencv/opencv/issues/9477#issuecomment-506404396
Any ideas about using OpenCV with a Raspberry Pi? I simply want to silence the warning.
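If the goal is only to hide the message, one option is to silence stderr at the file-descriptor level around the decoding call. This is a sketch under the assumption that the warning is written by the native JPEG library directly to stderr, which is why Python-level logging settings do not affect it. The name `suppress_native_stderr` is hypothetical, and the approach hides every stderr message inside the block, including real errors.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def suppress_native_stderr():
    """Temporarily redirect file descriptor 2 (stderr) to os.devnull."""
    sys.stderr.flush()
    saved_fd = os.dup(2)  # remember where stderr pointed
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 2)  # native writes to stderr are now discarded
        yield
    finally:
        os.dup2(saved_fd, 2)  # restore the original stderr
        os.close(devnull_fd)
        os.close(saved_fd)


# Usage: wrap only the call that triggers the warning, for example
#     with suppress_native_stderr():
#         frame = cv2.imread(path)
with suppress_native_stderr():
    os.write(2, b"this line is discarded\n")
```

Keeping the wrapped region as small as possible limits how much genuine error output gets lost.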

TeamATR on 24 Jan 2020

I fixed the corrupt images with Bad Peggy. Strongly recommended!

Shaul-Z on 24 Jan 2020

Hello, I have the same problem as you all when training on the kangaroo dataset on Google Colab.

@Shaul-Z can you please tell me whether I can run Bad Peggy in Colab and, if yes, how?

@seahawks8 can you please tell me whether I can run Mogrify in Colab and, if yes, how?

@junfengchen2016 can you please explain how I can use your code on the kangaroo dataset?

Monster-Gaming-Studios on 2 Apr 2020

@Monster-Gaming-Studios most of your questions can be solved with a quick google. If you have an issue please post your stack trace and provide more information.

seahawks8 on 3 Apr 2020

@seahawks8 While using the kangaroo dataset, I can call cv2.imread() without any error. But to be on the safe side, I rewrote all the images using the following code:

import os
import cv2

image_dir = '/content/kangaroo/images'
for name in os.listdir(image_dir):
    if name.endswith('.jpg'):
        path = os.path.join(image_dir, name)
        img = cv2.imread(path)
        if img is None:
            # imread returns None on failure; keep the original file in that case
            print('could not read image: ' + path)
            continue
        cv2.imwrite(path, img)  # overwrites in place, re-encoding the JPEG
        print('rewrote image: ' + path)

I didn't get any error when running this code. I generated the TFRecord and trained the model with the script at models/research/object_detection/model_main.py, using faster_rcnn_inception_v2_coco_2018_01_28.tar.gz as my base model. I am getting the following output:

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From /content/models/research/object_detection/model_main.py:110: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.

WARNING:tensorflow:From /content/models/research/object_detection/utils/config_util.py:102: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

W0403 11:04:47.941774 140391857076096 module_wrapper.py:139] From /content/models/research/object_detection/utils/config_util.py:102: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING:tensorflow:From /content/models/research/object_detection/model_lib.py:628: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.

W0403 11:04:47.946399 140391857076096 module_wrapper.py:139] From /content/models/research/object_detection/model_lib.py:628: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.

WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0403 11:04:47.946679 140391857076096 model_lib.py:629] Forced number of epochs for all eval validations to be 1.
WARNING:tensorflow:From /content/models/research/object_detection/utils/config_util.py:488: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

W0403 11:04:47.946868 140391857076096 module_wrapper.py:139] From /content/models/research/object_detection/utils/config_util.py:488: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

INFO:tensorflow:Maybe overwriting train_steps: None
I0403 11:04:47.947014 140391857076096 config_util.py:488] Maybe overwriting train_steps: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0403 11:04:47.947142 140391857076096 config_util.py:488] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0403 11:04:47.947262 140391857076096 config_util.py:488] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0403 11:04:47.947416 140391857076096 config_util.py:488] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0403 11:04:47.947547 140391857076096 config_util.py:488] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0403 11:04:47.947665 140391857076096 config_util.py:498] Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0403 11:04:47.948395 140391857076096 model_lib.py:645] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0403 11:04:47.948548 140391857076096 model_lib.py:680] create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': '/content/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7faf42b35ac8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0403 11:04:47.949143 140391857076096 estimator.py:212] Using config: {'_model_dir': '/content/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7faf42b35ac8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7faf42b49048>) includes params argument, but params are not passed to Estimator.
W0403 11:04:47.949395 140391857076096 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7faf42b49048>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0403 11:04:47.950266 140391857076096 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0403 11:04:47.950525 140391857076096 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0403 11:04:47.950808 140391857076096 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.


INFO:tensorflow:Done calling model_fn.
I0403 11:05:16.342241 140391857076096 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0403 11:05:16.343708 140391857076096 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0403 11:05:18.796912 140391857076096 monitored_session.py:240] Graph was finalized.
2020-04-03 11:05:18.797341: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-03 11:05:18.801818: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2020-04-03 11:05:18.802026: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xf42b2c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-03 11:05:18.802061: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
INFO:tensorflow:Restoring parameters from /content/model.ckpt-0
I0403 11:05:18.803865 140391857076096 saver.py:1284] Restoring parameters from /content/model.ckpt-0

INFO:tensorflow:Running local_init_op.
I0403 11:05:20.131512 140391857076096 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0403 11:05:20.318775 140391857076096 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /content/model.ckpt.
I0403 11:05:26.670471 140391857076096 basic_session_run_hooks.py:606] Saving checkpoints for 0 into /content/model.ckpt.
INFO:tensorflow:loss = 1.4689393, step = 1
I0403 11:05:36.717814 140391857076096 basic_session_run_hooks.py:262] loss = 1.4689393, step = 1
INFO:tensorflow:global_step/sec: 0.203829
I0403 11:13:47.325588 140391857076096 basic_session_run_hooks.py:692] global_step/sec: 0.203829
INFO:tensorflow:loss = 0.6350638, step = 101 (490.609 sec)
I0403 11:13:47.326665 140391857076096 basic_session_run_hooks.py:260] loss = 0.6350638, step = 101 (490.609 sec)
INFO:tensorflow:Saving checkpoints for 123 into /content/model.ckpt.
I0403 11:15:29.243787 140391857076096 basic_session_run_hooks.py:606] Saving checkpoints for 123 into /content/model.ckpt.
INFO:tensorflow:Calling model_fn.
I0403 11:15:31.186844 140391857076096 estimator.py:1148] Calling model_fn.
INFO:tensorflow:Scale of 0 disables regularizer.
I0403 11:15:33.066133 140391857076096 regularizers.py:98] Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
I0403 11:15:33.090975 140391857076096 regularizers.py:98] Scale of 0 disables regularizer.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0403 11:15:33.091600 140391857076096 convolutional_box_predictor.py:151] depth of additional conv before box predictor: 0
INFO:tensorflow:Scale of 0 disables regularizer.
I0403 11:15:34.593629 140391857076096 regularizers.py:98] Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
I0403 11:15:34.613992 140391857076096 regularizers.py:98] Scale of 0 disables regularizer.


INFO:tensorflow:Done calling model_fn.
I0403 11:15:37.179416 140391857076096 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-04-03T11:15:37Z
I0403 11:15:37.200857 140391857076096 evaluation.py:255] Starting evaluation at 2020-04-03T11:15:37Z
INFO:tensorflow:Graph was finalized.
I0403 11:15:37.786732 140391857076096 monitored_session.py:240] Graph was finalized.
INFO:tensorflow:Restoring parameters from /content/model.ckpt-123
I0403 11:15:37.788249 140391857076096 saver.py:1284] Restoring parameters from /content/model.ckpt-123
INFO:tensorflow:Running local_init_op.
I0403 11:15:38.594528 140391857076096 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0403 11:15:38.740363 140391857076096 session_manager.py:502] Done running local_init_op.
2020-04-03 11:17:50.294897: E tensorflow/core/lib/jpeg/jpeg_mem.cc:323] Premature end of JPEG data. Stopped at line 382/393
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node Dataset_map_process_fn_16122}} Invalid JPEG data or crop window, data size 86603
     [[{{node case/cond/cond_jpeg/DecodeJpeg}}]]
     [[IteratorGetNext]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/models/research/object_detection/model_main.py", line 110, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/content/models/research/object_detection/model_main.py", line 105, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/training.py", line 473, in train_and_evaluate
    return executor.run()
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/training.py", line 613, in run
    return self.run_local()
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/training.py", line 714, in run_local
    saving_listeners=saving_listeners)
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 1195, in _train_model_default
    saving_listeners)
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 1494, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1259, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1426, in run
    run_metadata=run_metadata))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/basic_session_run_hooks.py", line 594, in after_run
    if self._save(run_context.session, global_step):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/basic_session_run_hooks.py", line 619, in _save
    if l.after_save(session, step):
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/training.py", line 519, in after_save
    self._evaluate(global_step_value)  # updates self.eval_result
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/training.py", line 539, in _evaluate
    self._evaluator.evaluate_and_export())
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/training.py", line 920, in evaluate_and_export
    hooks=self._eval_spec.hooks)
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 480, in evaluate
    name=name)
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 522, in _actual_eval
    return _evaluate()
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 511, in _evaluate
    output_dir=self.eval_dir(name))
  File "/tensorflow-1.15.2/python3.6/tensorflow_estimator/python/estimator/estimator.py", line 1619, in _evaluate_run
    config=self._session_config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/evaluation.py", line 272, in _evaluate_once
    session.run(eval_ops, feed_dict)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1259, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1418, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1176, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Invalid JPEG data or crop window, data size 86603
     [[{{node case/cond/cond_jpeg/DecodeJpeg}}]]
     [[IteratorGetNext]]

Also I am getting these new files:
-model.ckpt-0.data-0000-of-0001
-model.ckpt-0.index
-model.ckpt-0.meta
-model.ckpt-123.data-0000-of-0001
-model.ckpt-123.index
-model.ckpt-123.meta
-events.out.tfevents.1585985273.a31a03779ab7
-checkpoint

Please help me

Monster-Gaming-Studios on 3 Apr 2020

TensorFlow 2.1 has the same issue when running ResNet50 from this tutorial: https://adventuresinmachinelearning.com/transfer-learning-tensorflow-2/

goog on 30 Apr 2020

Run the following script to fix the wrongly formatted and corrupt images before running create_pet_tf_record.py; after that everything works.

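import os
import sys

import PIL.Image  # "import PIL" alone does not load the Image submodule
import tensorflow as tf  # TF 1.x API: tf.gfile (moved to tf.io.gfile in TF 2.x)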

def main(argv):
    path_images = './images'
    filenames_src = tf.gfile.ListDirectory(path_images)
    for filename_src in filenames_src:
        stem, extension = os.path.splitext(filename_src)
        if extension.lower() != '.jpg':
            continue

        pathname_jpg = '{}/{}'.format(path_images, filename_src)
        with tf.gfile.GFile(pathname_jpg, 'rb') as fid:
            encoded_jpg = fid.read(4)
        # png: file starts with the PNG signature bytes 0x89 'P' 'N' 'G'
        if encoded_jpg.startswith(b'\x89PNG'):
            # copy jpg->png then encode png->jpg
            print('png:{}'.format(filename_src))
            pathname_png = '{}/{}.png'.format(path_images, stem)
            tf.gfile.Copy(pathname_jpg, pathname_png, True)
            PIL.Image.open(pathname_png).convert('RGB').save(pathname_jpg, "jpeg")   
        # gif: file starts with the ASCII bytes 'GIF'
        elif encoded_jpg.startswith(b'GIF'):
            # copy jpg->gif then encode gif->jpg
            print('gif:{}'.format(filename_src))
            pathname_gif = '{}/{}.gif'.format(path_images, stem)
            tf.gfile.Copy(pathname_jpg, pathname_gif, True)
            PIL.Image.open(pathname_gif).convert('RGB').save(pathname_jpg, "jpeg")   
        # the two JPEG files reported as corrupt in this thread: re-encode them with PIL
        elif filename_src in ('beagle_116.jpg', 'chihuahua_121.jpg'):
            # copy jpg->jpeg then encode jpeg->jpg
            print('jpeg:{}'.format(filename_src))
            pathname_jpeg = '{}/{}.jpeg'.format(path_images, stem)
            tf.gfile.Copy(pathname_jpg, pathname_jpeg, True)
            PIL.Image.open(pathname_jpeg).convert('RGB').save(pathname_jpg, "jpeg")   
        # anything else that does not start with the JPEG start-of-image marker
        elif not encoded_jpg.startswith(b'\xff\xd8\xff'):
            print('not jpg:{}'.format(filename_src))

if __name__ == "__main__":
    sys.exit(int(main(sys.argv) or 0))
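
The script above hard-codes the two corrupt file names. For other datasets, a rough first pass could look like the sketch below. It uses only the standard library and checks two things: that a file starts with the JPEG start-of-image marker and that it ends with the end-of-image marker. The names jpeg_problem and scan_directory are made up for this example. This is a crude heuristic, not a full decoder: it should catch wrong formats and truncated files, but it does not look inside the stream, so it may well miss files that only trigger libjpeg's "extraneous bytes" warning, and it may flag valid JPEGs that carry extra data after the end marker.

```python
import os


def jpeg_problem(data):
    """Return a description of an obvious structural problem, or None.

    Only the first and last bytes are inspected, so a file that passes
    can still fail to decode.
    """
    if not data.startswith(b"\xff\xd8\xff"):
        return "no JPEG start-of-image marker (wrong format?)"
    if not data.endswith(b"\xff\xd9"):
        return "no JPEG end-of-image marker at end of file (truncated?)"
    return None


def scan_directory(path_images):
    """Map each suspicious *.jpg file name in path_images to its problem."""
    problems = {}
    for name in sorted(os.listdir(path_images)):
        if not name.lower().endswith(".jpg"):
            continue
        with open(os.path.join(path_images, name), "rb") as f:
            problem = jpeg_problem(f.read())
        if problem is not None:
            problems[name] = problem
    return problems
```

Calling scan_directory('./images') returns a dict of file name to problem text; anything it reports is worth opening and re-saving with PIL (or deleting) before building the TFRecords.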

Thanks. Using this code, I found the following:
(0) The dataset is 'oxford_iiit_pet'.
(1) 'beagle_116.jpg' & 'chihuahua_121.jpg' are corrupted.
(2) 'Abyssinian_5.jpg', 'Egyptian_Mau_14.jpg', 'Egyptian_Mau_156.jpg' & 'Egyptian_Mau_186.jpg' are actually PNG files rather than JPEG files.
(3) 'Abyssinian_34', 'Egyptian_Mau_139', 'Egyptian_Mau_145', 'Egyptian_Mau_167', 'Egyptian_Mau_177' & 'Egyptian_Mau_191' are actually 8-bit GIF files rather than JPEG files.
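
A grouping like the one in (2) and (3) can be reproduced without modifying any files. The sketch below is a read-only variant that sorts the .jpg files of a directory by the format their first bytes indicate. It needs only the standard library; SIGNATURES, sniff_format and group_by_format are names invented for this example, and the short signature prefixes are a simplification (the full PNG signature is 8 bytes long). It cannot detect corrupt-but-correctly-named JPEGs such as the two in (1).

```python
import os

# (format name, leading bytes) pairs; prefixes are deliberately short
SIGNATURES = (
    ("png", b"\x89PNG"),
    ("gif", b"GIF8"),        # covers both GIF87a and GIF89a
    ("jpeg", b"\xff\xd8\xff"),
)


def sniff_format(header):
    """Guess the image format from the first bytes of a file."""
    for name, magic in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def group_by_format(path_images):
    """Group the *.jpg file names in path_images by their sniffed format."""
    groups = {}
    for name in sorted(os.listdir(path_images)):
        if not name.lower().endswith(".jpg"):
            continue
        with open(os.path.join(path_images, name), "rb") as f:
            fmt = sniff_format(f.read(8))
        groups.setdefault(fmt, []).append(name)
    return groups
```

For example, group_by_format('./images').get('png', []) would list the files that carry a .jpg extension but contain PNG data.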