Horovod: 'float' object has no attribute 'detach' in pytorch imagenet example.

Created on 22 Feb 2019 · 1 Comment · Source: horovod/horovod

I ran the PyTorch ImageNet example but got an error saying that a float has no detach() method. It seems that loss.item() returns a plain Python float, which is then passed to update(), but I don't know how to fix that within the Horovod framework.

Can anyone help me? Thanks a lot!

mpirun -np 4 \
  -H localhost:4 \
  -bind-to none -map-by slot \
  -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH \
  -mca pml ob1 -mca btl ^openib \
  python main_hvd.py --train-dir /datasets/ILSVRC2012/images/train --val-dir /datasets/ILSVRC2012/images/val
Train Epoch     #1:   0%|          | 0/10010 [00:00<?, ?it/s]Traceback (most recent call last):
  File "main_hvd.py", line 272, in <module>
    train(epoch)
  File "main_hvd.py", line 179, in train
    train_loss.update(loss.item())
  File "main_hvd.py", line 263, in update
    self.sum += hvd.allreduce(val.detach().cpu(), name=self.name)
AttributeError: 'float' object has no attribute 'detach'

My environment is:

  • pytorch==0.4.1
  • horovod==0.16.0
bug

Most helpful comment

Sorry about that, it's a bug. I've submitted #853 with a fix. In the meantime, you can replace train_loss.update(loss.item()) with train_loss.update(loss), so that update() receives the loss tensor rather than a Python float.
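To make the cause concrete, here is a minimal sketch that reproduces the failure and the workaround without needing PyTorch or Horovod installed. FakeTensor and fake_allreduce are hypothetical stand-ins for torch.Tensor and hvd.allreduce, and the Metric class is only a guess at the shape of the one in main_hvd.py (the traceback shows just self.sum, self.name and the update() line); it is not the actual example code or the fix in #853.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor with only the methods used here."""

    def __init__(self, value):
        self.value = float(value)

    def detach(self):
        return self

    def cpu(self):
        return self

    def item(self):
        # Like Tensor.item(): returns a plain Python number, not a tensor.
        return self.value


def fake_allreduce(tensor, name=None):
    """Hypothetical stand-in for hvd.allreduce with a single worker:
    averaging over one process returns the input unchanged."""
    return tensor


class Metric:
    """Assumed shape of the metric helper; only update() mirrors the traceback."""

    def __init__(self, name):
        self.name = name
        self.sum = 0.0
        self.n = 0

    def update(self, val):
        # Same call chain as the failing line: val must be tensor-like.
        self.sum += fake_allreduce(val.detach().cpu(), name=self.name).item()
        self.n += 1

    @property
    def avg(self):
        return self.sum / self.n


def demo():
    loss = FakeTensor(0.25)

    ok = Metric("train_loss")
    ok.update(loss)  # workaround: pass the tensor itself

    broken = Metric("train_loss")
    try:
        broken.update(loss.item())  # original call: passes a float
        error = None
    except AttributeError as exc:
        error = str(exc)
    return ok.avg, error


if __name__ == "__main__":
    avg, error = demo()
    print(avg)
    print(error)
```

The point is only that update() calls .detach() on its argument, so it must be given the tensor; calling .item() first converts the loss to a float before update() ever sees it.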

