Fairseq: BART training time

Created on 19 Dec 2019 · 1 Comment · Source: pytorch/fairseq

May I know how much time BART pre-training took, and on which GPU configuration? I can see in the paper that it was trained for 500K steps with a batch size of 8K, but I want to know how long that took. Many thanks.

question

Most helpful comment

The time depends on the type and number of GPUs. We trained for around 11-12 days on 256 GPUs.
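From the figure above, the total compute budget works out to roughly 2,900 GPU-days (256 GPUs × ~11.5 days). A minimal sketch for translating that to a different GPU count, under the optimistic assumption of perfectly linear scaling (real throughput also depends on GPU generation, interconnect, and batch-size limits):

```python
def estimated_days(reported_days: float, reported_gpus: int, your_gpus: int) -> float:
    """Rough wall-clock estimate assuming ideal linear scaling across GPUs."""
    gpu_days = reported_days * reported_gpus  # total GPU-days of compute
    return gpu_days / your_gpus

# Reported: ~11-12 days on 256 GPUs -> roughly 2,800-3,100 GPU-days total.
print(estimated_days(11.5, 256, 8))  # on 8 GPUs: ~368 days, ignoring overheads
```

This is only a lower-bound intuition; in practice scaling efficiency drops as per-GPU batch size and communication overheads change.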

