transformers version: bert-base-uncased
Model I am using: Bert
The problem arises when using:
```python
for _ in range(params.eval_steps):
    # fetch the next evaluation batch
    batch_data, batch_tags = next(data_iterator)
    batch_masks = batch_data.gt(0)

    loss, _ = model(batch_data, token_type_ids=None, attention_mask=batch_masks, labels=batch_tags)
    if params.n_gpu > 1 and params.multi_gpu:
        loss = loss.mean()
    loss_avg.update(loss.item())

    batch_output = model(batch_data, token_type_ids=None, attention_mask=batch_masks)  # shape: (batch_size, max_len, num_labels)
    batch_output = batch_output.detach().cpu().numpy()
    batch_tags = batch_tags.to('cpu').numpy()

    pred_tags.extend([idx2tag.get(idx) for indices in np.argmax(batch_output, axis=2) for idx in indices])
    true_tags.extend([idx2tag.get(idx) for indices in batch_tags for idx in indices])
assert len(pred_tags) == len(true_tags)
```
The task I am working on is:
Steps to reproduce the behavior:
Use the transformers library instead of pytorch-pretrained-bert.

Traceback (most recent call last):
File "train.py", line 219, in <module>
train_and_evaluate(model, train_data, val_data, optimizer, scheduler, params, args.model_dir, args.restore_file)
File "train.py", line 106, in train_and_evaluate
train_metrics = evaluate(model, train_data_iterator, params, mark='Train')
File "/content/BERT-keyphrase-extraction/evaluate.py", line 54, in evaluate
batch_output = batch_output.detach().cpu().numpy()
AttributeError: 'tuple' object has no attribute 'detach'
Expected behavior: the model should continue training after the first epoch.
As the error says, you've called `.detach()` on the model output, which in transformers is always a tuple. A tuple has no `.detach()` method; the tensor you want is one of its elements. The transformers documentation lists what each model returns.
You probably want the first output of your model, so change this line:
`batch_output = model(batch_data, token_type_ids=None, attention_mask=batch_masks)`
to
`batch_output = model(batch_data, token_type_ids=None, attention_mask=batch_masks)[0]`
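If the same evaluation code has to run against both libraries, one option is to unwrap the output defensively. The snippet below is only a sketch: `first_output` is a hypothetical helper, not part of transformers, and the nested list stands in for the logits tensor so that the example runs without PyTorch installed.

```python
def first_output(outputs):
    """Return the first element of a tuple or list output, else the value itself."""
    if isinstance(outputs, (tuple, list)):
        return outputs[0]
    return outputs


# Stand-in for the model call from the traceback: a one-element tuple.
# The nested list plays the role of the logits tensor (hypothetical values).
fake_outputs = ([[0.1, 0.9], [0.8, 0.2]],)

logits = first_output(fake_outputs)
print(type(fake_outputs).__name__, "->", type(logits).__name__)
```

In `evaluate.py` this would presumably be applied as `batch_output = first_output(model(batch_data, token_type_ids=None, attention_mask=batch_masks))`, before the `.detach().cpu().numpy()` call.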