Transformers: RAG Retriever (NameError: name 'load_dataset' is not defined in retrieval_rag.py)

Created on 28 Sep 2020 · 2 Comments · Source: huggingface/transformers

Environment info

  • transformers version: 3.3.0
  • Platform: Linux-4.19.0-11-cloud-amd64-x86_64-with-debian-10.6
  • Python version: 3.7.3
  • PyTorch version (GPU?): 1.6.0+cpu (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?: NO
  • Using distributed or parallel set-up in script?: NO

Who can help

@sshleifer
The RAG model is not on the list, but this is summarization-related.

Information

Model I am using: RAG

The problem arises when using:

  • [x] the official example scripts: (give details below)

    ```python
    from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration
    import torch

    tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
    retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True)

    # initialize with RagRetriever to do everything in one forward call
    model = RagSequenceForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
    ```

The task I am working on is:
The model couldn't load, so no task was performed.

To reproduce

Steps to reproduce the behavior:

1. Run the code:

```python
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration
import torch

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True)
# initialize with RagRetriever to do everything in one forward call
model = RagSequenceForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
```

Expected behavior

The retriever should load without error. Instead, the call raises a NameError: load_dataset is not defined.

NameError                                 Traceback (most recent call last)
<ipython-input-6-752205d4a1c8> in <module>
      3 
      4 tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
----> 5 retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True)
      6 # initialize with RagRetriever to do everything in one forward call
      7 model = RagSequenceForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)

/mnt/disks/nlp/env_nlp_main/lib/python3.7/site-packages/transformers/retrieval_rag.py in from_pretrained(cls, retriever_name_or_path, **kwargs)
    307         generator_tokenizer = rag_tokenizer.generator
    308         return cls(
--> 309             config, question_encoder_tokenizer=question_encoder_tokenizer, generator_tokenizer=generator_tokenizer
    310         )
    311 

/mnt/disks/nlp/env_nlp_main/lib/python3.7/site-packages/transformers/retrieval_rag.py in __init__(self, config, question_encoder_tokenizer, generator_tokenizer)
    287                 config.retrieval_vector_size,
    288                 config.index_path,
--> 289                 config.use_dummy_dataset,
    290             )
    291         )

/mnt/disks/nlp/env_nlp_main/lib/python3.7/site-packages/transformers/retrieval_rag.py in __init__(self, dataset_name, dataset_split, index_name, vector_size, index_path, use_dummy_dataset)
    218 
    219         logger.info("Loading passages from {}".format(self.dataset_name))
--> 220         self.dataset = load_dataset(
    221             self.dataset_name, with_index=False, split=self.dataset_split, dummy=self.use_dummy_dataset
    222         )

NameError: name 'load_dataset' is not defined
wontfix

All 2 comments

Try `pip install transformers datasets faiss-cpu psutil` (or see the requirements.txt file).

I had the same issue, and this fixed it for me.
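The fix above suggests the NameError comes from a missing optional package. Presumably `load_dataset` is only imported when `datasets` is installed, though the issue does not confirm this. Under that assumption, a quick check before creating the retriever could look like the sketch below. The helper name `missing_packages` and the module-to-package mapping are illustrative and not part of transformers.

```python
import importlib.util

# Hypothetical mapping: import name -> pip package name, taken from the
# pip command suggested in the comment above. Note that the pip package
# "faiss-cpu" is imported as "faiss".
REQUIRED_MODULES = {
    "datasets": "datasets",
    "faiss": "faiss-cpu",
    "psutil": "psutil",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        pip_name
        for module_name, pip_name in modules.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print an install hint instead of failing later with a NameError.
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All packages from the suggested pip command were found.")
```

If the script lists any package, installing it and restarting the Python session (or notebook kernel) before calling `RagRetriever.from_pretrained` again should be enough to tell whether this was the cause.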

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
