I implemented MobileNet, but the forward pass seems slow on GPU. I use grouped convolution to implement depthwise convolution, setting the number of groups equal to the number of feature maps. The computation saved by depthwise convolution doesn't reduce inference time. The MobileNet implemented in TF seems to be much faster: https://github.com/Zehaos/MobileNet
I ran into the same problem. Will MXNet add a more efficient grouped convolution implementation? I found that TF can run MobileNet at 0.059 s/image on CPU!
I encountered the same problem. The implementation of grouped convolutions is slow.
Generally speaking, the grouped implementation is based on one or more GEMMs (matrix multiplications, one common way of implementing convolution). For example, if num_group = 1 (the default), a single GEMM is executed; when num_group = 2, two GEMMs are executed, and so on, so the number of GEMM calls grows linearly with the number of groups.
I haven't read MXNet's low-level code, so I can't confirm it works exactly as I described above; you would have to read the low-level code to check. But if you want a fast depthwise convolution, you probably need to write a dedicated low-level depthwise operator rather than implementing it in Python.
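To make the cost argument concrete, here is a minimal NumPy sketch (not MXNet's actual kernel; stride 1, no padding, and all shapes and names here are assumptions for illustration) of the per-group im2col + GEMM scheme described above:

```python
import numpy as np

def grouped_conv2d(x, w, num_group):
    """Grouped 2-D convolution, stride 1, no padding.

    x: (C_in, H, W), w: (C_out, C_in // num_group, kH, kW).
    """
    c_in, h, wd = x.shape
    c_out, c_in_g, kh, kw = w.shape
    oh, ow = h - kh + 1, wd - kw + 1
    cog = c_out // num_group              # output channels per group
    out = np.empty((c_out, oh, ow), dtype=x.dtype)
    for g in range(num_group):            # one im2col + GEMM per group
        xg = x[g * c_in_g:(g + 1) * c_in_g]
        # im2col: lay every receptive-field patch out as a column
        cols = np.empty((c_in_g * kh * kw, oh * ow), dtype=x.dtype)
        row = 0
        for c in range(c_in_g):
            for i in range(kh):
                for j in range(kw):
                    cols[row] = xg[c, i:i + oh, j:j + ow].ravel()
                    row += 1
        wg = w[g * cog:(g + 1) * cog].reshape(cog, -1)
        out[g * cog:(g + 1) * cog] = (wg @ cols).reshape(cog, oh, ow)
    return out

# Depthwise case: num_group == C_in, so this runs C_in tiny GEMMs.
x = np.random.randn(32, 28, 28).astype(np.float32)
w = np.random.randn(32, 1, 3, 3).astype(np.float32)
y = grouped_conv2d(x, w, num_group=32)    # shape (32, 26, 26)
```

With num_group equal to the channel count (the depthwise case), each GEMM is tiny, so per-call launch overhead dominates on GPU, which would explain why the reduced FLOPs don't translate into lower latency.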
This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen if this is still an active issue. Thanks!
Also, do please check out our forum (and Chinese version) for general "how-to" questions.
For the record, we now have an efficient depthwise convolution implementation when you set num_filter equal to num_group.
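For anyone landing here, a quick sketch of how that fast path is triggered; the parameter names (num_filter, num_group) are those of MXNet's Convolution operator, and the layer sizes are made up for illustration:

```python
import mxnet as mx

channels = 32
data = mx.sym.Variable('data')
# Depthwise 3x3 conv: num_group == num_filter == number of input channels
# selects the efficient depthwise implementation.
conv_dw = mx.sym.Convolution(data=data, kernel=(3, 3), pad=(1, 1),
                             num_filter=channels, num_group=channels,
                             no_bias=True, name='conv_dw')
# MobileNet follows each depthwise conv with a 1x1 "pointwise" conv.
conv_pw = mx.sym.Convolution(data=conv_dw, kernel=(1, 1),
                             num_filter=64, no_bias=True, name='conv_pw')
```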