Datasets: Has ReadInstruction been removed?

Created on 8 Oct 2019 · 5 Comments · Source: tensorflow/datasets

Short description
I want to split the test portion of a TensorFlow dataset in half so I can use one half as test data and the other half as validation data. I was trying to follow the examples here, and it seems that ReadInstruction is no longer in tensorflow_datasets.

Environment information

  • Operating System: Windows
  • Python version: 3.7.4
  • tensorflow: Version: 2.0.0
  • tensorflow-datasets: Version: 1.2.0

Reproduction instructions
Just try to create a ReadInstruction object.

Expected behavior
I expected the documentation above to be up-to-date and accurate.

bug

All 5 comments

Thanks for reporting. Indeed, we forgot to expose the object in the public API.
I think you should try the string version instead:

train_ds, test_ds = tfds.load('mnist:3.*.*', split=['train[50%:]', 'train[:50%]'])
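For the original goal (cutting the test split in half for test and validation data), the same string syntax can be pointed at the test split. This is only a minimal sketch: `half_splits` and `load_test_and_validation` are hypothetical helper names, and it assumes a dataset version that supports the slicing API, such as the `mnist:3.*.*` used above.

```python
def half_splits(split_name):
    """Return two slicing-API split strings that cut `split_name` in half."""
    return [f"{split_name}[:50%]", f"{split_name}[50%:]"]


def load_test_and_validation(dataset_name):
    """Load the two halves of the test split as separate datasets.

    Assumes the dataset version supports the string slicing API.
    """
    # Imported lazily so half_splits() can be used without the library installed.
    import tensorflow_datasets as tfds

    test_ds, validation_ds = tfds.load(dataset_name, split=half_splits("test"))
    return test_ds, validation_ds


# Example call (downloads the dataset on first use):
# test_ds, validation_ds = load_test_and_validation("mnist:3.*.*")
```

On a legacy dataset this call is expected to fail in the way described further down the thread, since only the predefined split names are accepted there.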

Ah. Thanks for the confirmation.

I think you should try to use the string version instead...

I tried that and an error was reported saying that only "train", "test", and some other category I can't remember off the top of my head were available.

In other words, I believe that the string version is also... not available on the public API.

It means that you're probably using a legacy dataset. Have you tried the legacy API? https://www.tensorflow.org/datasets/splits#legacy_slicing_api

Ah yes. I see that the documentation is using a data source (imdb_reviews/subwords8k) that is flagged for "retirement" and does not support the S3 slicing API.

For anyone else stumbling on this thread, you can test this by running:

import tensorflow_datasets as tfds

builder = tfds.builder('imdb_reviews/subwords8k')
builder.version.implements(tfds.core.Experiment.S3)

which outputs False.

Thanks.

Should be fixed with https://github.com/tensorflow/datasets/pull/1064
tfds.core.ReadInstruction is now exposed.
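With the object exposed, the half-and-half split from the original question could look roughly like the sketch below. It assumes a tensorflow-datasets release that includes the fix and a dataset that supports the new slicing API. `half_instruction_kwargs` and `load_halves` are hypothetical helper names used only for illustration.

```python
def half_instruction_kwargs(split_name):
    """Keyword arguments for two ReadInstruction objects, one per half."""
    return [
        {"split_name": split_name, "to": 50, "unit": "%"},     # first 50%
        {"split_name": split_name, "from_": 50, "unit": "%"},  # last 50%
    ]


def load_halves(dataset_name, split_name="test"):
    """Load both halves of `split_name` using tfds.core.ReadInstruction.

    Assumes a tensorflow-datasets version where ReadInstruction is public.
    """
    # Imported lazily so the helper above works without the library installed.
    import tensorflow_datasets as tfds

    instructions = [
        tfds.core.ReadInstruction(**kwargs)
        for kwargs in half_instruction_kwargs(split_name)
    ]
    return tfds.load(dataset_name, split=instructions)


# Example call (downloads the dataset on first use):
# test_ds, validation_ds = load_halves("mnist:3.*.*")
```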
