Fairseq: Error during inference of model trained on fp16

Created on 19 Feb 2019 · 3 comments · Source: pytorch/fairseq

Hi,
I trained a translation model using the transformer architecture with fp16 on a Turing GPU. The resulting model is half the size of the original fp32 model I trained, which is good. Now I want to run inference for both models on CPU only. When I run inference on the fp16 model without passing the --fp16 argument, I get the same inference time for both models. But when I pass the --fp16 flag, I get the following error:

  File "/home/fairseq_translation/fairseq/fairseq/sequence_generator.py", line 148, in generate
    return self._generate(encoder_input, beam_size, maxlen, prefix_tokens)
  File "/home/fairseq_translation/fairseq/fairseq/sequence_generator.py", line 174, in _generate
    encoder_out = model.encoder(**encoder_input)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/fairseq_translation/fairseq/fairseq/models/transformer.py", line 314, in forward
    x = self.embed_scale * self.embed_tokens(src_tokens)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/sparse.py", line 118, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1454, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: _th_index_select is not implemented for type torch.HalfTensor

Why am I getting this error? Did I miss something? Please help.
Thanks

All 3 comments

I don't think half precision computation is supported on CPU. @myleott?
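
For what it's worth, here is a quick way to check this outside fairseq. A minimal sketch, assuming the same PyTorch build as in the traceback above (roughly 1.0; newer releases may behave differently), with arbitrary vocabulary and embedding sizes:

    import torch
    import torch.nn as nn

    # fp16 embedding weights on CPU, mirroring self.embed_tokens in the traceback
    emb = nn.Embedding(1000, 512).half()
    tokens = torch.tensor([[4, 5, 6]])  # LongTensor of token ids
    out = emb(tokens)  # RuntimeError: _th_index_select is not implemented for type torch.HalfTensor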

That's correct: you have to choose FP16 or CPU. This is a limitation of PyTorch.
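
If you need CPU inference, one workaround is to cast the fp16 checkpoint back to fp32 before loading it. A minimal sketch, assuming the usual fairseq checkpoint layout with parameters under the 'model' key (verify against your own checkpoint; the file names here are hypothetical):

    import torch

    # Load the fp16 checkpoint onto CPU and cast floating-point tensors to fp32
    ckpt = torch.load('checkpoint_fp16.pt', map_location='cpu')
    ckpt['model'] = {
        k: v.float() if torch.is_tensor(v) and v.is_floating_point() else v
        for k, v in ckpt['model'].items()
    }
    torch.save(ckpt, 'checkpoint_fp32.pt')

You would then run generation on the fp32 copy without the --fp16 flag.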

Hi @myleott, thanks for the reply. Is this a limitation of libraries like PyTorch and TensorFlow, or of the CPU architecture itself? Looking at this comment, I'm confused about whether it is a limitation of PyTorch, as you said, or of CPUs.
