Datasets: Loading SUN397 crashed

Created on 21 Dec 2020  ·  8 Comments  ·  Source: tensorflow/datasets


Hi there, I am trying to load the SUN397 dataset to build a small image classification NN, and I get an error while loading it. When TensorFlow reaches roughly example 55910-55920, it raises: ValueError: Cannot take the length of shape with unknown rank.

Windows 10 Pro
Python Version: 3.8.6
tfds Version: 4.1.0
tfds-nightly Version:
tensorflow Version: 2.4.0

Here is the code I use to load the SUN397 dataset:
batch_size = 128

(train_ds, val_ds), info = tfds.load("sun397", split=["train[:55900]", "validation"], as_supervised=True, with_info=True)

Model

train_ds = train_ds.map(lambda img, label: (tf.image.resize(img, [img_width, img_height]) / 255.0, label)).shuffle(1024).batch(batch_size)
val_ds = val_ds.map(lambda img, label: (tf.image.resize(img, [img_width, img_height]) / 255.0, label)).batch(batch_size)

Here are my Logs:

Generating train examples...: 55816 examples [16:46, 113.83 examples/s]
Generating train examples...: 55830 examples [16:46, 117.48 examples/s]
Generating train examples...: 55843 examples [16:46, 117.46 examples/s]
Generating train examples...: 55856 examples [16:46, 81.19 examples/s] 
Generating train examples...: 55866 examples [16:46, 53.75 examples/s]
Generating train examples...: 55874 examples [16:47, 52.12 examples/s]
Generating train examples...: 55881 examples [16:47, 42.26 examples/s]
Generating train examples...: 55887 examples [16:47, 46.28 examples/s]
Generating train examples...: 55893 examples [16:47, 44.83 examples/s]
Generating train examples...: 55907 examples [16:47, 56.20 examples/s]
Generating train examples...: 55915 examples [16:48, 39.78 examples/s]
Generating train examples...: 55922 examples [16:48, 44.33 examples/s]
WARNING:absl:Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by OpenCV, falling back to TF
CRITICAL:absl:Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by Tensorflow

Traceback (most recent call last):
  File "C:/Users/Jann/OneDrive - UMB AG/Schule/IT/Semester 3/122/Projekt/Code/Test/TestModel.py", line 31, in <module>
    (train_ds, val_ds), info = tfds.load("sun397", split=["train[:55900]", "validation"], as_supervised=True, with_info=True)
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\load.py", line 328, in load
    dbuilder.download_and_prepare(**download_and_prepare_kwargs)
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 432, in download_and_prepare
    self._download_and_prepare(
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1158, in _download_and_prepare
    split_info_futures = [
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1159, in <listcomp>
    split_builder.submit_split_generation(  # pylint: disable=g-complex-comprehension
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\split_builder.py", line 295, in submit_split_generation
    return self._build_from_generator(**build_kwargs)
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\split_builder.py", line 354, in _build_from_generator
    for key, example in utils.tqdm(
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tqdm\std.py", line 1167, in __iter__
    for obj in iterable:
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\image_classification\sun.py", line 283, in _generate_examples
    image = _process_image_file(
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\image_classification\sun.py", line 123, in _process_image_file
    image = _decode_image(fobj, session, filename=filename)
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\image_classification\sun.py", line 104, in _decode_image
    if len(image.shape) == 4:  # rank=4 -> rank=3
  File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 848, in __len__
    raise ValueError("Cannot take the length of shape with unknown rank.")
ValueError: Cannot take the length of shape with unknown rank.
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: Ignoring invalid time value

Process finished with exit code 1

Thanks to anyone who has an idea what's happening. I tried a lot and read through the documentation but didn't find anything, so thanks a lot.

bug contributions welcome

All 8 comments

Can you paste your complete code and the error message?

Thank you for sharing the logs:

CRITICAL:absl:Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by Tensorflow

It seems to be raised by:

https://github.com/tensorflow/datasets/blob/721b0d8ff937dd6cf97604e1447acc84393290a9/tensorflow_datasets/image_classification/sun.py#L101

I'm not sure why this is raised only now and not before. Does this mean that newer TF versions are unable to decode images which were previously decoded correctly with tf.image.decode_image? Is it system dependent (Windows only)?
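One way to narrow this question down is to look at the raw bytes of the file named in the log before blaming a particular decoder. The following is a minimal sketch using only the Python standard library; the helper names `sniff_image_format` and `jpeg_looks_complete` are hypothetical and not part of tfds or TensorFlow. It assumes the problem might be a file whose content does not match its .jpg extension, or a truncated download, which the thread has not confirmed. The libpng warnings in the log at least hint that some files being decoded are PNG data.

```python
from typing import Optional

# Leading "magic" bytes of common image formats.
_SIGNATURES = (
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
)


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return the format suggested by the leading bytes, or None if unrecognized."""
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    return None


def jpeg_looks_complete(data: bytes) -> bool:
    """Heuristic only: a complete JPEG normally ends with the EOI marker FF D9.

    Trailing zero padding is ignored. Some valid files carry other trailing
    bytes, so a False result is a hint, not proof of corruption.
    """
    if sniff_image_format(data) != "jpeg":
        return False
    return data.rstrip(b"\x00").endswith(b"\xff\xd9")
```

To use it, read the extracted file in binary mode (for example `open(path, "rb").read()`, where `path` points at your local copy of sun_aophkoiosslinihb.jpg) and pass the bytes to both helpers. If the format is unrecognized or the JPEG looks truncated on Windows but not on another system, that points toward the download or extraction step rather than the TF version.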

@rohit11544 I did post my code above, but here is the line I used. I ran the script with only this line in it:
(train_ds, val_ds), info = tfds.load("sun397", split=["train[:55900]", "validation"], as_supervised=True, with_info=True)

The error you asked for is above as well.

@FPGSchiba The issue is with the split argument: it accepts only "train" and "test", not "validation". Replace "validation" with "test" and it will work. I ran the same code on the mnist dataset; see these screenshots.

1) split=["train[:55900]", "test"]

Screenshot (42)

2) split=["train[:55900]", "validation"]

Screenshot (43)

@rohit11544 I don't think this applies to SUN397; in your example you are loading mnist, not SUN397.
And my error has nothing to do with splitting. Did you even read my first post?

@FPGSchiba First, I am sorry for running your code on the mnist dataset rather than SUN397. I completely agree with your point, but the split argument doesn't depend on the dataset, right? That is why I suggested replacing "validation" with "test". If it works, fine; if not, we can try to find another solution.

@rohit11544 Sorry for my harsh words. I am sure that the available splits are specific to each dataset, but I tried it anyway and it didn't work. The log output was the same as in the first post.
This is the code I ran:
(train_ds, val_ds), info = tfds.load("sun397", split=["train", "test"], as_supervised=True, with_info=True)

Thanks for your help

@FPGSchiba Hey, that's completely fine. I didn't know that splits are specific to each dataset, which is why I suggested that. OK, we will find another way.
