Fairseq: Receiving "_th_index_select not supported on CPUType for Half" error. Your help is highly appreciated.

Created on 27 Oct 2019 · 5 comments · Source: pytorch/fairseq

I am trying to fine-tune on GLUE tasks following the instructions at https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.glue.md . After completing steps 1 and 2, I receive a "RuntimeError: _th_index_select not supported on CPUType for Half" error when I run the command for the third step (fine-tuning on a GLUE task).
I am using the following Python version (in a conda virtual environment) and GPU:
Python 3.6.5
torch 1.2.0+cu92
NVIDIA TITAN Xp (12 GB memory)
CUDA driver version: 390.116

I would highly appreciate your help.

Actual Error:

| model roberta_large, criterion SentencePredictionCriterion
| num. model params: 356462683 (num. trained: 356462683)
| training on 1 GPUs
| max tokens per GPU = 4400 and max sentences per GPU = 16
| no existing checkpoint found /home/user/Desktop/benchmarks/code/external/fairseq/glue_data/RTE/model/model.pt
| loading train data for epoch 0
| loaded 2490 examples from: RTE-bin/input0/train
| loaded 2490 examples from: RTE-bin/input1/train
| loaded 2490 examples from: RTE-bin/label/train
| Loaded train with #samples: 2490
| epoch 001:   0%| | 0/156 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 343, in <module>
    cli_main()
  File "train.py", line 339, in cli_main
    main(args)
  File "train.py", line 92, in main
    train(args, trainer, task, epoch_itr)
  File "train.py", line 133, in train
    log_output = trainer.train_step(samples)
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/trainer.py", line 342, in train_step
    raise e
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/trainer.py", line 306, in train_step
    ignore_grad
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/tasks/fairseq_task.py", line 246, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/criterions/sentence_prediction.py", line 41, in forward
    classification_head_name='sentence_classification_head',
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/models/roberta/model.py", line 99, in forward
    x, extra = self.decoder(src_tokens, features_only, return_all_hiddens, **kwargs)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/models/roberta/model.py", line 287, in forward
    x, extra = self.extract_features(src_tokens, return_all_hiddens)
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/models/roberta/model.py", line 295, in extract_features
    last_state_only=not return_all_hiddens,
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/Desktop/mosharaf/bitbucket/benchmarks_negation/benchmarks_negation/code/external/fairseq/fairseq/modules/transformer_sentence_encoder.py", line 183, in forward
    x = self.embed_tokens(tokens)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1467, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: _th_index_select not supported on CPUType for Half

All 5 comments

Please check whether the model is loaded on the CPU. Also, are you using a Volta GPU, or another GPU with tensor cores that can perform fp16 computations?
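For example, a quick sanity check along these lines can confirm both points (a minimal sketch, not from the thread; it assumes `model` is the RoBERTa model you loaded):

import torch

# Half-precision index_select is not implemented on CPU, so fp16 training needs a visible CUDA device.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    # Tensor cores require compute capability 7.0+ (Volta or newer); a TITAN Xp reports 6.1.
    print(torch.cuda.get_device_capability(0))
# The model's parameters should report cuda:0, not cpu.
print(next(model.parameters()).device)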

I have the same issue, any updates?

In my case, it had to do with a mismatched CUDA version.
I downgraded to a PyTorch version (1.2) that matched the CUDA version (10.0) I had installed, and it worked.
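For anyone hitting the same thing, a couple of lines like these (a minimal sketch, not part of the original comment) show whether the installed wheel's CUDA build actually matches the machine:

import torch

print(torch.__version__)          # e.g. 1.2.0
print(torch.version.cuda)         # CUDA version the wheel was built against, e.g. 10.0 or 9.2
print(torch.cuda.is_available())  # False when the local driver cannot run that CUDA build;
                                  # the model then stays on the CPU and the fp16 error above appears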

Hi, are there any updates? I also ran into this issue during a sampling phase, but it worked with the fairseq_generate CLI. Thanks guys, I'd appreciate any help.

I've encountered this type of error.
In my case it was because my dataset was of type np.float16.
Converting the dataset from np.float16 to np.float32 solved the problem.
Cheers!
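Something like this is enough for the cast (a minimal sketch with a hypothetical file name, not the poster's actual code):

import numpy as np

data = np.load("my_dataset.npy")        # hypothetical input file
if data.dtype == np.float16:
    # float16 arrays become Half tensors on the CPU side and trigger the index_select error above
    data = data.astype(np.float32)
np.save("my_dataset_fp32.npy", data)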
