Datasets: Dataset wikipedia cannot be loaded at version 1.0.0, only: 0.0.3

Created on 12 Apr 2020 · 4 Comments · Source: tensorflow/datasets

I am on tfds-nightly and downloaded the wikipedia dataset using:
python -m tensorflow_datasets.scripts.download_and_prepare --datasets=wikipedia/20190301.en
Now I am trying to access it using:
ds, info = tfds.load('wikipedia/20190301.en:1.0.0', download=False, shuffle_files=True, with_info=True)

But I receive the following error:

ERROR:absl:Failed to construct dataset wikipedia
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
 in 
      1 # Construct a tf.data.Dataset
----> 2 ds, info = tfds.load('wikipedia/20190301.en:1.0.0', download=False, shuffle_files=True, with_info=True)

~\Anaconda3\envs\docBert\lib\site-packages\tensorflow_datasets\core\api_utils.py in disallow_positional_args_dec(fn, instance, args, kwargs)
     50     ismethod = instance is not None
     51     _check_no_positional(fn, args, ismethod, allowed=allowed)
---> 52     _check_required(fn, kwargs)
     53     return fn(*args, **kwargs)
     54 

~\Anaconda3\envs\docBert\lib\site-packages\tensorflow_datasets\core\registered.py in load(name, split, data_dir, batch_size, in_memory, shuffle_files, download, as_supervised, decoders, with_info, builder_kwargs, download_and_prepare_kwargs, as_dataset_kwargs, try_gcs)
    295       [the guide](https://github.com/tensorflow/datasets/tree/master/docs/decode.md)
    296       for more info.
--> 297     read_config: `tfds.ReadConfig`, Additional options to configure the
    298       input pipeline (e.g. seed, num parallel reads,...).
    299     with_info: `bool`, if True, tfds.load will return the tuple

~\Anaconda3\envs\docBert\lib\site-packages\tensorflow_datasets\core\registered.py in builder(name, **builder_init_kwargs)
    167     elif class_dict.get("IN_DEVELOPMENT"):
    168       _IN_DEVELOPMENT_REGISTRY[name] = builder_cls
--> 169     else:
    170       _DATASET_REGISTRY[name] = builder_cls
    171     return builder_cls

~\Anaconda3\envs\docBert\lib\site-packages\tensorflow_datasets\core\api_utils.py in disallow_positional_args_dec(fn, instance, args, kwargs)
     50     ismethod = instance is not None
     51     _check_no_positional(fn, args, ismethod, allowed=allowed)
---> 52     _check_required(fn, kwargs)
     53     return fn(*args, **kwargs)
     54 

~\Anaconda3\envs\docBert\lib\site-packages\tensorflow_datasets\core\dataset_builder.py in __init__(self, data_dir, config, version)
    178         `builder_config`s will have their own subdirectories and versions.
    179       version: `str`. Optional version at which to load the dataset. An error is
--> 180         raised if specified version cannot be satisfied. Eg: '1.2.3', '1.2.*'.
    181         The special value "experimental_latest" will use the highest version,
    182         even if not default. This is not recommended unless you know what you

~\Anaconda3\envs\docBert\lib\site-packages\tensorflow_datasets\core\dataset_builder.py in _pick_version(self, requested_version)
    209   def __setstate__(self, state):
    210     self.__init__(**state)
--> 211 
    212   @utils.memoized_property
    213   def canonical_version(self):

AssertionError: Dataset wikipedia cannot be loaded at version 1.0.0, only: 0.0.3.
help

All 4 comments

@dvirginz you are providing the wrong config; replace 20190301 with 20200301 (see here).

But this is what tfds downloaded: tensorflow_datasets\wikipedia\20190301.en\1.0.0
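
The path quoted above follows a dataset / config / version layout, so the versions already prepared on disk can be listed directly and compared with what the error message reports. The following is only a minimal sketch using the standard library: `list_prepared_versions` is a hypothetical helper, not a tfds API, and it assumes every version directory is named like `1.0.0`.

```python
from pathlib import Path
import re

# Assumption: prepared data lives at <data_dir>/<dataset>/<config>/<version>,
# as in the path quoted above. This helper is hypothetical, not part of tfds.
_VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")


def list_prepared_versions(data_dir, dataset, config):
    """Return the version directory names found on disk, sorted numerically."""
    config_dir = Path(data_dir) / dataset / config
    if not config_dir.is_dir():
        return []
    versions = [
        p.name
        for p in config_dir.iterdir()
        if p.is_dir() and _VERSION_RE.match(p.name)
    ]
    # Sort by numeric components so that 10.0.0 comes after 2.0.0.
    return sorted(versions, key=lambda v: tuple(int(x) for x in v.split(".")))
```

For example, `list_prepared_versions(Path.home() / "tensorflow_datasets", "wikipedia", "20190301.en")` would inspect the usual default data directory; pass whatever `data_dir` was actually used for the download.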

Try again after reinstalling tfds-nightly. The wikipedia dataset is currently at version 1.0.0 (see), whereas your stack trace says that only 0.0.3 can be loaded.
Also, use only the latest config, 20200301.

Edit: It works fine with the 20190301 config too (see this colab), but that is an older config, so using the latest one is recommended. I also think that after reinstalling tfds-nightly, or simply cloning this repo, 20190301 will work fine for you.
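
To make the mismatch concrete: the requested name carries the version after the colon (`:1.0.0`), while the error message lists the only version that could be loaded (`0.0.3`). The sketch below is an illustration only. `parse_name` and `check_version` are hypothetical helpers that mimic the shape of the error and do not reproduce how tfds resolves versions internally.

```python
def parse_name(name):
    """Split 'dataset/config:version' into parts; config and version are optional."""
    rest, _, version = name.partition(":")
    dataset, _, config = rest.partition("/")
    return dataset, config or None, version or None


def check_version(name, supported):
    """Raise if the version requested in `name` is not in `supported`.

    `supported` is a hypothetical list of versions the installed library accepts.
    """
    dataset, _, requested = parse_name(name)
    if requested is not None and requested not in supported:
        raise AssertionError(
            "Dataset %s cannot be loaded at version %s, only: %s."
            % (dataset, requested, ", ".join(supported))
        )
    # With no explicit version, fall back to the first supported one.
    return requested or supported[0]
```

Calling `check_version('wikipedia/20190301.en:1.0.0', ['0.0.3'])` raises an `AssertionError` with the same wording as the one in the report, whereas dropping the `:1.0.0` suffix lets the call fall back to a supported version.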

@dvirginz it seems that your issue is solved, so please close the issue.
