Folks who would also like to see this dataset in tensorflow/datasets, please +1/thumbs-up so the developers can know which requests to prioritize.
As this is a good first issue, I would like to take this up 😀
There are 2 versions of this dataset, 32 * 32 images and 64 * 64 images. Should both be done?
That's great @tabshaikh, thank you!
Yes, I think we should have both, using a "heavy" configuration, but you can start with just one.
I'll assign the issue to you as soon as you accept the collaborator invite! :smiley:
@rsepassi Invite accepted, I should have a PR soon :)
Sounds good! Thank you!
@rsepassi I would love to collaborate with @tabshaikh and make a pull request for Downsampled ImageNet. Please assign it to me also.
Hi @Anupam-tripathi, thanks for your interest! Are you already working directly with @tabshaikh? If not, let's give him a chance to get a PR in. If he'd like the help, then please do work together to get something in!
No, I have not joined him yet, but I will contact him directly.
@Anupam-tripathi I would love to collaborate with you, but the PR is almost done, with only a few changes left to do, and hopefully I have added the dataset correctly. @rsepassi I will open a PR by the middle of next week, as I will be at an ML hackathon over the weekend. I also have some questions, which I will ask in the draft PR.
Let us collaborate on adding a big dataset, @Anupam-tripathi. It would be great to have a teammate for it.
Sounds good. Thanks!
Ya, surely I will prove to be a good teammate.
@rodrigob @rsepassi The page at http://image-net.org/small/download.php contains neither the whole downsampled ImageNet dataset nor the labels.
I also found https://patrykchrabaszcz.github.io/Imagenet32/, which has the dataset details, and http://image-net.org/download-images, which contains the whole dataset.
I could not work out which ImageNet dev kit contains the labels mentioned at https://patrykchrabaszcz.github.io/Imagenet32/.
Also, the data requires a login and comes as a pickle file that unpickles into a dictionary.
Can you help me with how to proceed? :)
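For reference, here is a minimal sketch of reading such a pickled dictionary. It is not taken from the dataset's documentation: the key names `data` and `labels`, the CIFAR-style channel-major row layout, and the 1-indexed labels are all assumptions that should be checked against the real files. The helper `decode_batch` is hypothetical, and a synthetic batch stands in for a downloaded file.

```python
import io
import pickle

import numpy as np


def decode_batch(batch, size=32):
    """Convert an unpickled batch dict into (images, labels) arrays.

    Assumes (not confirmed in this thread) that batch["data"] holds one
    flattened image per row and batch["labels"] holds 1-indexed class ids.
    """
    data = np.asarray(batch["data"], dtype=np.uint8)
    # Assumed layout: each row is all red values, then green, then blue.
    images = data.reshape(-1, 3, size, size).transpose(0, 2, 3, 1)
    # Assumed: labels start at 1 in the file; shift them to start at 0.
    labels = np.asarray(batch["labels"], dtype=np.int64) - 1
    return images, labels


# Synthetic stand-in for a real downloaded file (which requires a login).
rng = np.random.default_rng(0)
fake = {
    "data": rng.integers(0, 256, size=(4, 3 * 32 * 32), dtype=np.uint8),
    "labels": [1, 5, 1000, 42],
}
buf = io.BytesIO()
pickle.dump(fake, buf)
buf.seek(0)

batch = pickle.load(buf)  # with a real file: pickle.load(open(path, "rb"))
images, labels = decode_batch(batch)
print(images.shape, labels.tolist())
```

Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.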
@tabshaikh - Yes, this dataset has only a subset of the subsampled ImageNet images and does NOT have labels.
This is on purpose, as it was used for autoregressive algorithms that generate output images (rather than trying to predict the class).
Please download from the official link rather than from unofficial ones.
@cyfra okay cool
The idea of this ticket was to create a smaller version of imagenet that is small enough so that most people can prototype and experiment without having to worry about download time or disk-space.
I would suggest going for the 32x32 and 64x64 versions, ideally _with_ labels so that supervised training (à la MNIST and CIFAR) can also be used.
@rodrigob okay thanks :)
Any thoughts on doing the 128x128 version while you're at it?
Joel, looks like there are only 2 versions listed, 32 and 64: http://image-net.org/small/download.php
@joel-shor Do you have a link to 128x128?
@rsepassi No, there are actually 4 versions (8x8, 16x16, 32x32, 64x64) at http://image-net.org/download-images. The link you pointed to is incomplete, as it has no labels.
@joel-shor Can you point me to the 128x128 version? :)
Sorry for the delay. The 128x128 ImageNet, which is used in a number of state-of-the-art GANs (such as Self Attention GAN), can be found here: https://github.com/openai/improved-gan/blob/master/imagenet/convert_imagenet_to_records.py
If you were able to turn this into a TFDS dataset, you would be a hero!
@tabshaikh Have you moved on from this?
This has been added with https://github.com/tensorflow/datasets/pull/613. Closing this now.