transformers version: 3.3.1

@ola13 @mfuntowicz
Hi - I am trying to get RAG running; however, I get an error when following the instructions here: https://huggingface.co/facebook/rag-token-nq
Specifically, the error message is as follows:
TypeError Traceback (most recent call last)
<ipython-input-7-35cd6a2213c0> in <module>
1 from transformers import AutoTokenizer, AutoModelWithLMHead
2
----> 3 tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-nq")
~/src/transformers/src/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
258 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
259 else:
--> 260 return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
261
262 raise ValueError(
~/src/transformers/src/transformers/tokenization_rag.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
61 print(config.generator)
62 print("***")
---> 63 generator = AutoTokenizer.from_pretrained(generator_path, config=config.generator)
64 return cls(question_encoder=question_encoder, generator=generator)
65
~/src/transformers/src/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
258 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
259 else:
--> 260 return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
261
262 raise ValueError(
~/src/transformers/src/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
1557
1558 return cls._from_pretrained(
-> 1559 resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
1560 )
1561
~/src/transformers/src/transformers/tokenization_utils_base.py in _from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs)
1648
1649 # Add supplementary tokens.
-> 1650 special_tokens = tokenizer.all_special_tokens
1651 if added_tokens_file is not None:
1652 with open(added_tokens_file, encoding="utf-8") as added_tokens_handle:
~/src/transformers/src/transformers/tokenization_utils_base.py in all_special_tokens(self)
1026 Convert tokens of :obj:`tokenizers.AddedToken` type to string.
1027 """
-> 1028 all_toks = [str(s) for s in self.all_special_tokens_extended]
1029 return all_toks
1030
~/src/transformers/src/transformers/tokenization_utils_base.py in all_special_tokens_extended(self)
1046 logger.info(all_toks)
1047 print(all_toks)
-> 1048 all_toks = list(OrderedDict.fromkeys(all_toks))
1049 return all_toks
1050
TypeError: unhashable type: 'dict'
The all_toks variable looks as follows. It is a list of dictionaries, and since dicts are unhashable, OrderedDict.fromkeys doesn't accept it.
[{'content': '<s>', 'single_word': False, 'lstrip': False, 'rstrip': False, 'normalized': True},
 {'content': '</s>', 'single_word': False, 'lstrip': False, 'rstrip': False, 'normalized': True},
 {'content': '<unk>', 'single_word': False, 'lstrip': False, 'rstrip': False, 'normalized': True},
 {'content': '</s>', 'single_word': False, 'lstrip': False, 'rstrip': False, 'normalized': True},
 {'content': '<pad>', 'single_word': False, 'lstrip': False, 'rstrip': False, 'normalized': True},
 {'content': '<s>', 'single_word': False, 'lstrip': False, 'rstrip': False, 'normalized': True},
 {'content': '<mask>', 'single_word': False, 'lstrip': True, 'rstrip': False, 'normalized': True}]
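To make the failure concrete, here is a minimal standalone sketch (not the library code itself) of why OrderedDict.fromkeys fails on such a list, and how deduplicating by each entry's 'content' string would work:

from collections import OrderedDict

# Trimmed-down version of the all_toks value above: serialized
# AddedToken entries, each a plain (unhashable) dict.
all_toks = [
    {"content": "<s>", "single_word": False, "lstrip": False,
     "rstrip": False, "normalized": True},
    {"content": "</s>", "single_word": False, "lstrip": False,
     "rstrip": False, "normalized": True},
    {"content": "</s>", "single_word": False, "lstrip": False,
     "rstrip": False, "normalized": True},
]

try:
    # dicts cannot serve as dict keys, so fromkeys raises TypeError.
    OrderedDict.fromkeys(all_toks)
except TypeError as err:
    print(err)  # unhashable type: 'dict'

# Deduplicating by the hashable "content" string works.
unique = list(OrderedDict.fromkeys(tok["content"] for tok in all_toks))
print(unique)  # ['<s>', '</s>']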
I will keep digging, hoping that I am making an obvious mistake.
To reproduce:

from transformers import AutoTokenizer, AutoModelWithLMHead
tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-nq")

Expected behavior: it should load the tokenizer!
Thank you.
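For reference, the model card also loads RAG through the RAG-specific classes rather than AutoTokenizer. A sketch along the lines of the facebook/rag-token-nq card (it needs the datasets and faiss packages; use_dummy_dataset=True avoids downloading the full index):

from transformers import RagTokenizer, RagRetriever, RagTokenForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)
model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)

input_dict = tokenizer.prepare_seq2seq_batch(
    "how many countries are in europe", return_tensors="pt"
)
generated = model.generate(input_ids=input_dict["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])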
When I run the examples from https://huggingface.co/transformers/model_doc/rag.html, I get exactly the same error.

Hey @dzorlu - thanks for reporting the error, I will take a look tomorrow!
Thanks @patrickvonplaten. Appreciate all the hard work :+1:
Should be solved now - let me know if you still experience problems @dzorlu
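(For anyone who hits this later: assuming the fix landed on master at the time, reinstalling from source and re-running the original snippet is a quick check.)

# Install the patched version from source:
# pip install git+https://github.com/huggingface/transformers
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-nq")
print(type(tokenizer).__name__)  # expected: RagTokenizer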
Thank you!