When it is required to do a forward propagation at first, and then do another one in the same process,
if gpu memory malloced by the first propagation can not be released, the second one will not have enough memory to use.
@DuCheng2018 Thank you for the feature request.
@mxnet-label-bot add [Feature Request]
I am getting multiple (more than 5) requests similar to this one. I think we need a formal C API as well as a Python API so users can explicitly release the GPU memory pool without waiting for the program to exit.
+1, for some applications it is necessary to release GPU memory.
+1, yes, this is a badly needed feature
+1
Hi there.
Thanks to @vladoovtcharov for adding this API for GPU memory release.
The API is mx.gpu(0).empty_cache().
First, we should set the environment variable MXNET_GPU_MEM_POOL_TYPE to select the GPU memory pool type.
Here is an example:
# MXNET_GPU_MEM_POOL_TYPE=Round python
Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> a = mx.nd.zeros((10000, 30000), ctx=mx.gpu(0)) + 1
[14:56:53] src/storage/storage.cc:110: Using GPUPooledRoundedStorageManager.
>>> del a
>>> mx.gpu(0).empty_cache()
In my test, most of the GPU memory is released after calling this API, although around 600 MiB remains cached.
This API only releases the cache in the GPU memory pool. It doesn't release the NDArray cache held by CachedOp
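The distinction above can be modeled with a small, self-contained sketch (this is not MXNet's actual implementation, just a conceptual simulation): a caching allocator keeps freed blocks in a pool for reuse, so the device still reports them as allocated after `del`, and only an explicit `empty_cache()` returns the cached blocks, while memory still held by live objects (e.g. by CachedOp) is untouched.

```python
# Conceptual model of a caching GPU allocator (NOT MXNet internals).
class PooledAllocator:
    def __init__(self):
        self.in_use = {}       # handle -> size of live allocations
        self.cache = []        # sizes of freed blocks kept for reuse
        self.device_bytes = 0  # bytes currently claimed from the "device"

    def alloc(self, handle, size):
        if size in self.cache:      # reuse a cached block if one fits
            self.cache.remove(size)
        else:                       # otherwise claim new device memory
            self.device_bytes += size
        self.in_use[handle] = size

    def free(self, handle):
        # Freeing only moves the block into the cache; the device still
        # counts this memory as allocated (what a tool like nvidia-smi shows).
        self.cache.append(self.in_use.pop(handle))

    def empty_cache(self):
        # Return cached blocks to the device; live allocations are untouched.
        self.device_bytes -= sum(self.cache)
        self.cache.clear()
```

For example, after `alloc("a", 100)` followed by `free("a")`, `device_bytes` is still 100; only `empty_cache()` brings it back to 0, while any handle still in `in_use` keeps its memory.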
+1
Closing this issue as #14252 is merged; please feel free to ping me if you have other thoughts.