File "C:\Users\temp\Aida\aida\agents\bertbot\Bert\bert_intent_classifier_pytorch.py", line 298, in process
logits = self.model(prediction_inputs, token_type_ids=None, attention_mask=prediction_masks)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\transformers\modeling_bert.py", line 897, in forward
head_mask=head_mask)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\transformers\modeling_bert.py", line 624, in forward
embedding_output = self.embeddings(input_ids, position_ids=position_ids, token_type_ids=token_type_ids)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\transformers\modeling_bert.py", line 167, in forward
words_embeddings = self.word_embeddings(input_ids)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\torch\nn\modules\sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "C:\Users\temp\Anaconda3\envs\fresh\lib\site-packages\torch\nn\functional.py", line 1484, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)
Hi everyone, when I run the line

```python
outputs = model(input_ids=b_input_ids, attention_mask=b_input_mask, labels=b_labels)
```

with the model defined as

```python
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=numlabels)
```

it returns the error above. However, this only happens on my Windows computer; when I run the exact same code with the same Python version and libraries on Linux, it works perfectly fine. I have the most recent versions of PyTorch (1.4) and transformers installed.
Any help would be greatly appreciated.
Using the latest versions of PyTorch and transformers
Model I am using (Bert, XLNet ...): BertForSequenceClassification
Language I am using the model on (English, Chinese ...): English
It's weird that there is a discrepancy between Windows and Linux. Are you defining some of your variables on the GPU? Does it fail if everything stays on the CPU? Could you also try casting your variables `b_input_ids`, `b_input_mask` and `b_labels` to `torch.long`?
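Something like this should do it (a minimal sketch, using the variable names from your snippet):

```python
b_input_ids = b_input_ids.long()    # embedding indices must be torch.int64 (Long)
b_input_mask = b_input_mask.long()
b_labels = b_labels.long()          # the classification loss also expects Long targets
```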
I often prototype on Windows and push to Linux for final processing, and I've never had this issue. Can you post a minimal working example that I can copy-paste to test?
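For reference, this class of failure can be reproduced without any project-specific data. A minimal sketch, assuming the PyTorch 1.4 from the traceback (where embedding indices had to be `torch.int64`; newer releases also accept `int32`):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(30522, 768)                            # BERT-base vocab and hidden sizes
idx = torch.randint(0, 30522, (2, 8), dtype=torch.int32)  # int32 indices, as produced on Windows
print(emb(idx.long()).shape)                              # torch.Size([2, 8, 768]) -- works
emb(idx)  # RuntimeError on PyTorch 1.4: expected scalar type Long, got torch.IntTensor
```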
OK, update: I got the error to go away, but to do it I had to apply some janky fixes that I don't think should be necessary:
```python
import torch
from tqdm import trange
from transformers import BertForSequenceClassification, AdamW, get_linear_schedule_with_warmup

# numlabels and train_dataloader are defined earlier (not shown)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=numlabels)
model.cuda()  # note: this assumes a GPU is present, even though device falls back to CPU
# model = nn.DataParallel(model)

# Hyperparameters our training loop needs
lr = 2e-5
max_grad_norm = 1.0
num_training_steps = 1000
num_warmup_steps = 100
warmup_proportion = float(num_warmup_steps) / float(num_training_steps)  # 0.1

# In transformers, optimizer and schedules are split and instantiated like this:
optimizer = AdamW(model.parameters(), lr=lr, correct_bias=False)  # to reproduce BertAdam-specific behavior, set correct_bias=False
scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=num_warmup_steps,
                                            num_training_steps=num_training_steps)  # PyTorch scheduler

# Store our loss for plotting
train_loss_set = []
# Number of training epochs (the authors recommend between 2 and 4)
epochs = 5  # 5: 0.96

# trange is a tqdm wrapper around the normal Python range
for _ in trange(epochs, desc="Epoch"):
    # Set our model to training mode (as opposed to evaluation mode)
    model.train()
    # Tracking variables
    tr_loss = 0
    nb_tr_examples, nb_tr_steps = 0, 0
    # Train the data for one epoch
    for step, batch in enumerate(train_dataloader):
        # Add batch to GPU
        batch = tuple(t.to(device) for t in batch)
        # Unpack the inputs from our dataloader
        b_input_ids, b_input_mask, b_labels = batch
        ############### Bug fix code ####################
        b_input_ids = b_input_ids.type(torch.LongTensor)
        b_input_mask = b_input_mask.type(torch.LongTensor)
        b_labels = b_labels.type(torch.LongTensor)
        b_input_ids = b_input_ids.to(device)
        b_input_mask = b_input_mask.to(device)
        b_labels = b_labels.to(device)
        #################################################
        # Clear out the gradients (by default they accumulate)
        optimizer.zero_grad()
        # Forward pass
        outputs = model(input_ids=b_input_ids, attention_mask=b_input_mask, labels=b_labels)
        loss, logits = outputs[:2]
        loss.backward()
        # Gradient clipping is not in AdamW anymore (so you can use amp without issue)
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
        optimizer.step()
        scheduler.step()
```
Very strange. (I posted the code I thought would be useful; let me know if you need to see more.)
You're doing `.to(device)` twice for your data (once in the tuple comprehension, once again after the cast). It is hard to reproduce this because we don't have your data, so we don't know how you encode it. What are example contents of `batch`, so that we can reproduce your issue?
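Note that `.type(torch.LongTensor)` also moves the tensor back to the CPU, which is why the extra `.to(device)` calls are needed at all. A sketch of collapsing the cast and the transfer into a single step, using the same variable names as the loop above:

```python
# Cast to int64 and move to the target device in one call,
# instead of GPU -> cast back on CPU -> GPU again
batch = tuple(t.to(device=device, dtype=torch.long) for t in batch)
b_input_ids, b_input_mask, b_labels = batch
```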
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Had a similar issue; the following fix from Stack Overflow worked:

```python
b_input_ids = torch.tensor(b_input_ids).to(torch.int64)
```

https://stackoverflow.com/questions/56360644/pytorch-runtimeerror-expected-tensor-for-argument-1-indices-to-have-scalar-t
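A plausible root cause for the Windows/Linux discrepancy (an assumption, since the thread never shows how the batches are built): with NumPy 1.x, the default integer type follows the platform's C `long`, which is 32-bit on Windows and 64-bit on most Linux builds, so tensors built from plain NumPy integer arrays come out as `torch.int32` only on Windows:

```python
import numpy as np
import torch

ids = np.array([101, 2023, 102])       # int32 on Windows (NumPy 1.x), int64 on most Linux builds
print(torch.tensor(ids).dtype)         # torch.int32 on Windows -> rejected by the embedding lookup
print(torch.tensor(ids).long().dtype)  # torch.int64 -> safe to pass as input_ids
```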
I'm having the same issue. The funny thing is that the whole model worked for training, but when running inference on the test data the error showed up.
I'm facing exactly the same issue. I am using an Amazon SageMaker notebook instance.