Short description
CLEVR is a Visual Question Answering dataset, but the questions are missing, so users need to jump through a bunch of hoops to download and parse them.
Environment information
tensorflow-datasets/tfds-nightly version: 1.1
tensorflow/tensorflow-gpu/tf-nightly/tf-nightly-gpu version: gpu 2b1
Reproduction instructions
import tensorflow as tf
import tensorflow_datasets as tfds

ds_test, ds_train, ds_validation = tfds.load(name="clevr", split=['test', 'train', 'validation'])
for element in ds_train:
    tf.print(element)
Link to logs
[[115 114 114]
[117 116 114]
[115 114 113]
...
[156 153 147]
[156 153 147]
[156 153 148]]],
'objects': {'3d_coords': [[0.438755214 -2.79463482 0.7]
[2.59563708 1.5761956 0.35]
[-2.9369 2.04038 0.35]
...
[2.28646755 2.97107983 0.7]
[-1.16437554 2.7382021 0.7]
[2.27009153 0.293746442 0.7]],
'color': [1 6 0 ... 3 2 2],
'material': [0 0 1 ... 0 0 0],
'pixel_coords': [[141 187 9.35730934]
[383 166 10.7077045]
Expected behavior
I expected a visual question answering dataset to have questions to answer.
Additional context
I confirm only images are present currently. It should probably be easy to add a 'question': tfds.features.Text() field in: https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/image/clevr.py
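Until such a field exists, the questions have to be joined to the images by hand. The following is a minimal sketch of that step using only the standard library. It assumes the CLEVR question files are JSON with a top-level "questions" list whose entries carry "image_filename", "question" and (outside the test split) "answer"; the helper name group_questions_by_image and the sample entries are made up for illustration and are not part of tfds or of the real dataset.

```python
import json
from collections import defaultdict


def group_questions_by_image(questions_json_text):
    """Map each image filename to its list of (question, answer) pairs.

    Assumption: the input follows the CLEVR questions layout, i.e. a
    top-level "questions" list whose entries have "image_filename",
    "question" and, where available, "answer".
    """
    data = json.loads(questions_json_text)
    grouped = defaultdict(list)
    for entry in data["questions"]:
        # Assumption: some splits may lack answers, so default to "".
        grouped[entry["image_filename"]].append(
            (entry["question"], entry.get("answer", ""))
        )
    return dict(grouped)


# Tiny hand-written sample in the assumed layout (not real CLEVR data).
sample = json.dumps({
    "questions": [
        {"image_filename": "CLEVR_train_000000.png",
         "question": "How many cubes are there?", "answer": "2"},
        {"image_filename": "CLEVR_train_000000.png",
         "question": "Is there a red sphere?", "answer": "no"},
        {"image_filename": "CLEVR_train_000001.png",
         "question": "What color is the cylinder?", "answer": "blue"},
    ]
})
questions_by_image = group_questions_by_image(sample)
```

Grouping by filename keeps one record per image, which is roughly the shape a per-example question feature in the dataset builder would need.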
@Conchylicultor: I can work on this issue! For testing, though, would it download the complete dataset of around 20GB? Or how else can I test this after implementing it?
@dhirensr Thanks for looking into this. For testing, you should generate the complete dataset at least once to ensure it is generated properly. The tests.py, however, just tries the generation on fake data (fewer than 5 examples). See the doc for more details: https://www.tensorflow.org/datasets/add_dataset#testing_mydataset
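As a rough illustration of what fake data for the questions could look like, here is a sketch that writes a tiny questions file with three placeholder entries. The directory layout, the file name pattern and the helper write_fake_questions are assumptions made for this example; the real archive layout and the location tfds expects for fake examples should be checked against the linked guide.

```python
import json
import os
import tempfile


def write_fake_questions(root_dir, split, num_examples=3):
    """Write a tiny questions file in an assumed CLEVR-like layout."""
    questions = [
        {
            "image_filename": f"CLEVR_{split}_{i:06d}.png",
            "question": f"Placeholder question {i}?",
            "answer": "yes",
        }
        for i in range(num_examples)
    ]
    # Hypothetical layout: <root>/questions/CLEVR_<split>_questions.json
    out_dir = os.path.join(root_dir, "questions")
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"CLEVR_{split}_questions.json")
    with open(path, "w") as f:
        json.dump({"questions": questions}, f)
    return path


# Write the fake file into a temporary directory and read it back.
fake_root = tempfile.mkdtemp()
fake_path = write_fake_questions(fake_root, "train")
with open(fake_path) as f:
    fake_data = json.load(f)
```

Keeping the fake file this small means the test exercises the parsing path without needing the full download.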
Hello! I would like to work on this issue. This will be my first contribution. Kindly guide me on this. Thanks.
Thanks.
The dataset is implemented in https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/image/clevr.py
Have a look at our guide to understand the code: https://www.tensorflow.org/datasets/add_dataset
@divij30bajaj are you still working on this issue?
Hey @divij30bajaj, if you are no longer working on this issue, I would love to take a look at it.
@Conchylicultor please review #1583.
Thank you