transformers version: 3.3.1
@patrickvonplaten
Model I am using (Bert, XLNet ...): RAG
The following usage of the token and sequence models should not be allowed, as it may give unintended results in the forward pass:
# RagSequenceForGeneration with "facebook/rag-token-nq"
model = RagSequenceForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
# RagTokenForGeneration with "facebook/rag-sequence-nq"
model = RagTokenForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)
Also, please correct the example at https://huggingface.co/transformers/master/model_doc/rag.html#ragsequenceforgeneration
The above usage should throw an exception because the two models are incompatible with each other.
The model weights are actually 1-1 compatible with each other, so I see no reason why we should throw an exception here.
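To illustrate this, here is a minimal sketch (assuming transformers 3.3+ with RAG support and the dummy retriever index used elsewhere in this thread) showing that both heads accept the same checkpoint; their parameter names and shapes line up one-to-one, and only generate() differs:
from transformers import RagRetriever, RagSequenceForGeneration, RagTokenForGeneration
retriever = RagRetriever.from_pretrained("facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True)
# Same "token" checkpoint loaded into both heads
seq_model = RagSequenceForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
tok_model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
# The state dicts match one-to-one, so loading either way is lossless
assert set(seq_model.state_dict().keys()) == set(tok_model.state_dict().keys())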
Hi Patrick, I also believe there are typos in the examples:
On the "sequence" model card at https://huggingface.co/facebook/rag-sequence-nq , the examples use "token" arguments, e.g.:
retriever = RagRetriever.from_pretrained("facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True)
model = RagSequenceForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
@patrickvonplaten yes, I agree with you. I am closing this.
@patrickvonplaten
I am seeing very weird behaviour: different combinations of RAG model class and checkpoint give me very different outputs, and I am not able to understand why.
Check the output of the generators for "What is capital of Germany?":
!pip install git+https://github.com/huggingface/transformers.git
!pip install datasets
!pip install faiss-cpu
!pip install torch torchvision
from transformers import RagTokenizer, RagRetriever, RagTokenForGeneration, RagSequenceForGeneration
import torch
import faiss  # faiss-cpu backs the retriever's dense index
# The tokenizer and retriever from the "sequence" checkpoint are shared by all four combinations below
tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True)
input_dict = tokenizer.prepare_seq2seq_batch("What is capital of Germany?", return_tensors="pt")
input_ids = input_dict["input_ids"]
# RagTokenForGeneration with "facebook/rag-token-nq"
model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
generated_ids = model.generate(input_ids=input_ids)
generated_string = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print("Result of model = ", generated_string)
# RagSequenceForGeneration with "facebook/rag-sequence-nq"
model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)
generated_ids = model.generate(input_ids=input_ids)
generated_string = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print("Result of model = ", generated_string)
# RagSequenceForGeneration with "facebook/rag-token-nq"
model = RagSequenceForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
generated_ids = model.generate(input_ids=input_ids)
generated_string = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print("Result of model = ", generated_string)
# RagTokenForGeneration with "facebook/rag-sequence-nq"
model = RagTokenForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)
generated_ids = model.generate(input_ids=input_ids)
generated_string = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print("Result of model = ", generated_string)
The output of the above run is (the behaviour is consistent):
Result of model = [' german capital']
Result of model = ['']
Result of model = [' munich']
Result of model = [' germany']
The model card typos should be fixed - thanks :-)
https://github.com/huggingface/transformers/blob/master/model_cards/facebook/rag-sequence-nq/README.md
Hey @lalitpagaria, the models differ in how they generate answers, so the results are not unexpected :-) If you take a closer look at the code, you can see that both models expect the exact same weights but have different generate() functions.
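To make the difference concrete, here is a toy sketch (not the library code; the tensors are made-up numbers) of the two decoding strategies from the RAG paper: RAG-Token marginalizes over the retrieved documents at every decoding step, while RAG-Sequence decodes one full hypothesis per document and rescores complete sequences:
import torch
doc_probs = torch.tensor([0.6, 0.4])  # p(doc | query) for 2 retrieved docs
p_tok = torch.tensor([[0.1, 0.7, 0.1, 0.1],   # p(next token | doc 0), toy vocab of 4
                      [0.5, 0.2, 0.2, 0.1]])  # p(next token | doc 1)
# RAG-Token style: mix the per-doc distributions at each step, then pick the next token
mixed = (doc_probs.unsqueeze(1) * p_tok).sum(dim=0)   # [0.26, 0.50, 0.14, 0.10]
print("RAG-Token next token:", mixed.argmax().item())  # token 1
# RAG-Sequence style: best continuation per doc, rescored later with
# p(seq | doc) * p(doc | query) over the complete sequences
print("Per-doc best tokens:", p_tok.argmax(dim=1).tolist())  # [1, 0] - the docs disagree
With the same weights, the two strategies can clearly commit to different tokens, which is why the four combinations above print different answers.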
Thanks @patrickvonplaten
I will play with a few parameters of RagConfig.
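e.g., something like this (assuming the standard generate() kwargs available for RAG in transformers 3.x, and reusing model, input_ids and tokenizer from the script above):
generated_ids = model.generate(
    input_ids=input_ids,
    num_beams=4,   # wider beam search
    n_docs=5,      # number of retrieved documents to marginalize over
    min_length=1,
    max_length=20,
)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))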