On every script run, the cuDNN convolution algorithm autotuning pass runs. This can take a few seconds, so I wonder if we could cache the result locally, keyed by a hash of the MXNet + CUDA + cuDNN versions for each device ID (or whatever else could change the algorithm selection)?
[20:48:19] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
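To make the request concrete, here is a minimal sketch of what such a cache could look like. Everything in it is illustrative, not MXNet API: the cache path, the key fields, and the helper names are all made up, and the real autotune logic lives in C++ in cudnn_algoreg-inl.h, so an actual fix would land there rather than in Python.

```python
import hashlib
import json
import os

# Hypothetical cache location; MXNet does not currently write this file.
CACHE_PATH = os.path.expanduser("~/.mxnet/cudnn_autotune_cache.json")


def cache_key(mxnet_version, cuda_version, cudnn_version, device_id, conv_signature):
    """Hash everything that could change the algorithm selection:
    library versions, the GPU, and the convolution's shape/dtype signature."""
    raw = json.dumps(
        [mxnet_version, cuda_version, cudnn_version, device_id, conv_signature],
        sort_keys=True,
    )
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def load_cached_algo(key):
    """Return a previously selected algorithm, or None on a cache miss."""
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            return json.load(f).get(key)
    return None


def store_algo(key, algo):
    """Persist the algorithm chosen by the performance tests."""
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    cache[key] = algo
    os.makedirs(os.path.dirname(CACHE_PATH), exist_ok=True)
    with open(CACHE_PATH, "w") as f:
        json.dump(cache, f)
```

In the meantime, the workaround mentioned in the log itself is to set MXNET_CUDNN_AUTOTUNE_DEFAULT=0 before starting the process, which skips the search entirely at the cost of possibly slower default kernels.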
@eric-haibin-lin: Please label: CUDA, Feature
Just want to +1. I've talked to quite a few MXNet users who could really use this functionality.
+1
Any news?
My team has an implementation of this in a fork. We'll try and contribute it back, but no promises on a timeline.
Any updates @KellenSunderland? That feature sounds very useful.
@KellenSunderland would love to know more, this is very relevant to us.
+1
Two years have passed...
This may be fixed as part of the cuDNN 8 integration? https://docs.nvidia.com/deeplearning/sdk/cudnn-api/index.html
cc @DickJC123