fastai code runs very slowly on a CPU

Created on 24 Oct 2018 · 4 Comments · Source: fastai/fastai

Describe the bug

fastai code runs very slowly on a CPU.

To Reproduce

import fastai
import fastai.vision

# Small WideResNet for the 10-class problem
model = fastai.vision.models.WideResNet(num_groups=3,
                                        N=3,
                                        num_classes=10,
                                        k=6,
                                        drop_p=0.)
path = fastai.untar_data(fastai.URLs.MNIST_TINY)
data = fastai.vision.ImageDataBunch.from_folder(path, bs=10)
# Use mixed precision only when a GPU is available
if data.device.type == 'cpu':
    learn = fastai.Learner(data, model, metrics=fastai.accuracy)
else: # GPU:
    learn = fastai.Learner(data, model, metrics=fastai.accuracy).to_fp16()
learn.fit_one_cycle(1, 3e-3, wd=0.4, div_factor=10, pct_start=0.5)

Expected behavior

I would expect to be able to handle a dataset with at least a few hundred patterns, but I have to trim it down to a few dozen (or fewer) to get something I can test on a CPU.

Am I doing something wrong? Or is the code just really slow on the CPU (much slower than Keras, for example)? Is this a torch issue?

All 4 comments

Could you show code and timings for Keras and fastai for this example, please? I haven't seen any documented examples of PyTorch being slower than Keras, so I'm sure the PyTorch team would be very interested to see if that's the case.

Also, when providing timings, please mention which BLAS you're using for each library, which Keras backend you're using, and what CPU you have. On CPU, the main source of speed differences tends to be the BLAS library you've linked against.
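A minimal sketch of how the details requested here could be collected, using only the Python standard library. The function name describe_environment and the list of packages are illustrative assumptions, not something from this thread. The script does not detect the BLAS library itself; with the libraries installed, numpy.show_config() and torch.__config__.show() print the build information for NumPy and PyTorch respectively.

```python
import os
import platform
from importlib import metadata

# Packages whose versions are worth reporting; this list is an assumption
# chosen for this thread, adjust it to the libraries being compared.
PACKAGES = ("torch", "fastai", "numpy", "keras", "tensorflow")


def describe_environment():
    """Collect CPU details and installed library versions for a timing report."""
    info = {
        "machine": platform.machine(),
        "processor": platform.processor() or "unknown",
        "logical_cpus": os.cpu_count(),
        "python": platform.python_version(),
    }
    for name in PACKAGES:
        try:
            # Reads package metadata only; the library itself is not imported.
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Pasting this output next to each timing makes it easier to tell a library difference from an environment difference.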

There may also be a problem with the number of OpenMP threads being set incorrectly, leading to only one CPU core being used.
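One way to check for this, sketched under the assumption that the thread count is controlled by the usual environment variables (OMP_NUM_THREADS, MKL_NUM_THREADS, OPENBLAS_NUM_THREADS). The helper names below are hypothetical. Inside a PyTorch session, torch.get_num_threads() reports the thread count actually in use, and torch.set_num_threads(n) changes it.

```python
import os

# Environment variables commonly read by OpenMP, MKL and OpenBLAS.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def thread_settings(environ=os.environ):
    """Return the thread-related variables alongside the logical CPU count."""
    settings = {name: environ.get(name, "unset") for name in THREAD_VARS}
    settings["logical_cpus"] = os.cpu_count()
    return settings


def looks_single_threaded(environ=os.environ):
    """True if any of the variables explicitly pins a library to one thread."""
    return any(environ.get(name) == "1" for name in THREAD_VARS)


if __name__ == "__main__":
    for key, value in thread_settings().items():
        print(f"{key}: {value}")
    if looks_single_threaded():
        print("At least one library is limited to a single thread.")
```

An unset variable does not prove that all cores are used, so the value reported by the library itself is still the more reliable check.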

Anyway - certainly interested in digging in to this question with you! :)

@dsblank any update on this?

I'm getting together some reproducible code, and data about the environments I have tested. Should have an update in a few days.

As this is not being followed up on, I'm closing this. If you still have the problem and have code to help us reproduce it, please re-open.
