Reported by @mickypaganini
The classes attribute of the EMNIST dataset does not take the split argument into account. From the original EMNIST dataset page, https://www.nist.gov/itl/products-and-services/emnist-dataset:
There are six different splits provided in this dataset. A short summary of the dataset is provided below:
EMNIST ByClass: 814,255 characters. 62 unbalanced classes.
EMNIST ByMerge: 814,255 characters. 47 unbalanced classes.
EMNIST Balanced: 131,600 characters. 47 balanced classes.
EMNIST Letters: 145,600 characters. 26 balanced classes.
EMNIST Digits: 280,000 characters. 10 balanced classes.
EMNIST MNIST: 70,000 characters. 10 balanced classes.
As of now, classes always returns the MNIST default, regardless of the split:
['0 - zero',
'1 - one',
'2 - two',
'3 - three',
'4 - four',
'5 - five',
'6 - six',
'7 - seven',
'8 - eight',
'9 - nine']
We need to make classes depend on split, and return the correct class names for each split.
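A minimal sketch of what a split-dependent mapping could look like is below. It is not the code from any PR. The names CLASSES_BY_SPLIT, MERGED_LOWERCASE and classes_for_split are hypothetical, and the set of merged lowercase letters is an assumption based on the EMNIST paper's description of the merged splits. The class counts match the summary above; how label indices map onto these names (for example, whether the Letters split starts at 0 or 1) is not checked here.

```python
import string

# Assumption: lowercase letters folded into their uppercase form in the
# ByMerge and Balanced splits (15 letters, leaving 62 - 15 = 47 classes).
MERGED_LOWERCASE = set("cijklmopsuvwxyz")

# Digits, then uppercase, then lowercase: 10 + 26 + 26 = 62 characters.
ALL_CHARS = string.digits + string.ascii_uppercase + string.ascii_lowercase

# Hypothetical mapping from split name to class names.
CLASSES_BY_SPLIT = {
    "byclass": list(ALL_CHARS),
    "bymerge": [c for c in ALL_CHARS if c not in MERGED_LOWERCASE],
    "balanced": [c for c in ALL_CHARS if c not in MERGED_LOWERCASE],
    "letters": list(string.ascii_lowercase),
    "digits": list(string.digits),
    "mnist": list(string.digits),
}


def classes_for_split(split):
    """Return the class names for an EMNIST split, or raise ValueError."""
    try:
        return CLASSES_BY_SPLIT[split]
    except KeyError:
        valid = ", ".join(sorted(CLASSES_BY_SPLIT))
        raise ValueError(f"Unknown split {split!r}; expected one of: {valid}") from None


if __name__ == "__main__":
    for name in CLASSES_BY_SPLIT:
        print(name, len(classes_for_split(name)))
```

A dataset class could look up this mapping with the split passed to its constructor instead of exposing one fixed class list.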
Hi @fmassa
I tried fixing the bug and have sent a PR (#1736). Could you review it when you get some free time?
Thanks :)
Thanks a lot for the PR @Gokkulnath !
Thank you for fixing this.
How can I grab this fix?
pip says I'm up to date with torchvision 0.5.0.
@anandijain
Hi
I don't see the changes merged into the 0.5 release, but they are available on the master branch.
I think the fix will be included in one of the subsequent releases. Meanwhile, feel free to replace your local mnist.py with the version from master.
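To find the file to replace, something like the sketch below may help. It only uses the standard library's importlib. The helper name installed_module_path is hypothetical, and the module path torchvision.datasets.mnist is an assumption about where mnist.py lives in the installed package. Back up the original file before overwriting it.

```python
import importlib.util


def installed_module_path(module_name="torchvision.datasets.mnist"):
    """Return the file path of an installed module, or None if it cannot be found."""
    try:
        # find_spec imports the parent packages, so this raises an
        # ImportError subclass if e.g. torchvision is not installed.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Prints the path of the installed mnist.py, or None if torchvision is missing.
    print(installed_module_path())
```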
thanks for the quick reply @Gokkulnath
Can I just copy the source mnist.py into my local install, or do I need to clone master and reinstall from source?
I tried just copying mnist.py and it worked just fine!
Thanks!
Thanks @Gokkulnath for the help!