Fairseq: AdaFactor to save GPU memory?

Created on 20 Sep 2018 · 3 Comments · Source: pytorch/fairseq

Tensor2Tensor includes the AdaFactor optimizer, which drastically reduces GPU memory usage. I believe it would be helpful for Fairseq to include it by default.
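For context, here is a rough sketch of where the savings come from. It assumes the factored second-moment estimate that AdaFactor uses for 2-D weight matrices: one row vector and one column vector in place of a full-size accumulator, with momentum optional. The function names and the example weight shape are hypothetical and are not part of Tensor2Tensor or Fairseq. The sketch only counts optimizer-state elements; it does not implement the optimizer.

```python
def adam_state_elems(rows: int, cols: int) -> int:
    """Adam keeps two full-size accumulators per weight matrix:
    the first moment and the second moment."""
    return 2 * rows * cols


def adafactor_state_elems(rows: int, cols: int, use_momentum: bool = False) -> int:
    """Factored second moment: one accumulator per row plus one per column.
    Momentum, if enabled, adds a full-size buffer (assumption: off by default)."""
    factored = rows + cols
    momentum = rows * cols if use_momentum else 0
    return factored + momentum


if __name__ == "__main__":
    rows, cols = 1024, 4096  # hypothetical feed-forward weight shape
    print("Adam state elements:     ", adam_state_elems(rows, cols))
    print("AdaFactor state elements:", adafactor_state_elems(rows, cols))
```

Under these assumptions, the optimizer state for a matrix grows with `rows + cols` instead of `rows * cols`, which is why the saving is largest for big embedding and feed-forward matrices.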

enhancement

All 3 comments

Good idea!

Working on this
