I'm pretty sure PyTorch does CPU parallelization by default.
@severinsimmler Thank you for your reply. Is there a way to verify that CPU parallelization is actually happening while training with Flair?
You could start training and check with a tool like `htop` whether more than one CPU core is being used.
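You can also inspect this from within Python. A minimal sketch (assuming a standard PyTorch install; the thread count of 4 is just an example):

```python
import torch

# PyTorch uses intra-op CPU parallelism by default; this reports
# how many threads it will use for operations like matmul.
print(torch.get_num_threads())

# The thread count can also be set explicitly, e.g. to 4:
torch.set_num_threads(4)
assert torch.get_num_threads() == 4
```

If `htop` shows only one busy core despite this, the bottleneck is likely elsewhere (e.g. data loading).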
What about multiple GPUs? PyTorch supports that, so how much work would be needed for Flair to support it?
There are several ways of adding multi-GPU support. One is PyTorch's built-in `DataParallel` class (see #848); another is Horovod (see #859). We are still evaluating which is the best way to go, but we currently favor Horovod since it is less invasive to the rest of the code and offers more functionality.
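For illustration, the `DataParallel` route usually amounts to wrapping the model. A minimal sketch with a placeholder model (not Flair's actual integration, which is what #848 discusses):

```python
import torch
import torch.nn as nn

# Hypothetical tiny model standing in for a Flair model.
model = nn.Linear(10, 2)

# Wrap in DataParallel only when more than one GPU is available;
# otherwise the plain model is used unchanged.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()
elif torch.cuda.is_available():
    model = model.cuda()

# The forward pass looks the same either way; DataParallel splits
# the batch across GPUs and gathers the outputs on the default device.
x = torch.randn(8, 10)
if torch.cuda.is_available():
    x = x.cuda()
out = model(x)
print(out.shape)
```

This is what makes `DataParallel` tempting: almost no code changes. Horovod instead requires launching one process per GPU and adding a distributed optimizer, which is more setup but typically scales better.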
Automatic mixed precision was also added recently (see #934) and has in fact already been merged to master. This gives huge speedups when training large language models on a single GPU.
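For context, a mixed-precision training step using PyTorch's native `torch.cuda.amp` API might look like the sketch below. The model, data, and hyperparameters are placeholders, and Flair's own AMP integration in #934 may be implemented differently (e.g. via NVIDIA apex):

```python
import torch
import torch.nn as nn

# AMP only helps on CUDA; with enabled=False everything below is a no-op.
use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"

model = nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

x = torch.randn(8, 10, device=device)
y = torch.randint(0, 2, (8,), device=device)

optimizer.zero_grad()
# autocast runs eligible ops in fp16, keeping precision-sensitive ones in fp32.
with torch.cuda.amp.autocast(enabled=use_amp):
    loss = nn.functional.cross_entropy(model(x), y)

# The scaler scales the loss to avoid fp16 gradient underflow, then
# unscales before the optimizer step.
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
print(float(loss))
```

The speedup comes from fp16 tensor-core math and halved memory traffic, which is why it is most noticeable on large language models.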
Is this about the Flair language model only, or other models as well (e.g. classification)? Any updates?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.