Keras: `predict` memory leak?

Created on 30 Mar 2018 · 9 comments · Source: keras-team/keras

Specs:
Python 3.6.3
keras==2.1.5
tensorflow==1.7.0

It seems there is a memory leak in the `predict` method. If not, please explain what I'm doing wrong:
https://gist.github.com/ilivans/fb2d61d9b5bc3d82d3d0e6eb04cf4778
The script gives me the following output:

Using TensorFlow backend.
Generate data...
x_train shape: (512, 400)
x_test shape: (128, 400)
Build model...
Predict...
rss=142MB vms=569MB
2018-03-30 18:54:06.784917: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
rss=175MB vms=1187MB
rss=191MB vms=1191MB
rss=207MB vms=1195MB
rss=207MB vms=1195MB
rss=207MB vms=1195MB
rss=207MB vms=1195MB
rss=207MB vms=1195MB
rss=223MB vms=1195MB
rss=223MB vms=1195MB
rss=223MB vms=1195MB
rss=223MB vms=1195MB
rss=222MB vms=1195MB
rss=222MB vms=1195MB
rss=222MB vms=1195MB
rss=238MB vms=1200MB
rss=238MB vms=1200MB
rss=238MB vms=1200MB
rss=238MB vms=1200MB
rss=238MB vms=1200MB
rss=238MB vms=1200MB
rss=238MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
rss=239MB vms=1200MB
...

The weird thing is that the memory usage converges to some value after a certain number of `predict` calls.

I also noticed that the larger the hidden size, the larger the memory leak, which is strange to me as well.
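
For reference, in case the gist link goes stale, the script is roughly the sketch below (not the exact gist; the 1024-unit hidden layer and the layer choices here are just illustrative):

```python
# Rough sketch of the reproduction: call predict() in a loop and watch process memory.
import os

import numpy as np
import psutil
from keras.models import Sequential
from keras.layers import Dense


def print_memory():
    """Print resident and virtual memory of the current process in MB."""
    mem = psutil.Process(os.getpid()).memory_info()
    print("rss={}MB vms={}MB".format(mem.rss // 1024 ** 2, mem.vms // 1024 ** 2))


print("Generate data...")
x_train = np.random.rand(512, 400)
x_test = np.random.rand(128, 400)
print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)

print("Build model...")
model = Sequential()
model.add(Dense(1024, activation="relu", input_shape=(400,)))  # illustrative hidden size; larger -> larger leak
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer="adam", loss="binary_crossentropy")

print("Predict...")
for _ in range(100):
    model.predict(x_test)
    print_memory()
```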

Thank you in advance.

All 9 comments

I've reproduced this behavior in pure TF, so the question probably needs to go to the TF gurus.
It is somehow connected to multi-threaded execution: with `inter_op_parallelism_threads=1` there are no excess allocations.
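
For anyone who wants to try that, this is roughly how I pinned the thread count through the Keras TF backend (TF 1.x session API); treat it as a sketch, not the exact code I ran:

```python
# Sketch: give Keras a TF session limited to a single inter-op thread (TF 1.x API).
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto(inter_op_parallelism_threads=1)
K.set_session(tf.Session(config=config))
# Build the model and call predict() after this point so it uses the configured session.
```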

@ilivans even with `inter_op_parallelism_threads=1`, the memory consumption still increases. The increase is smaller but still significant. I would vote to re-open the issue.

@abhiboost It's definitely not on the Keras side. I bet you can reproduce the same scenario in pure TF and observe the same behavior, so I don't think this is the right place for the question. There is probably an issue with TF's dynamic memory allocation.
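
Roughly, a pure-TF loop along these lines (hypothetical shapes, not my exact script) is the kind of reproduction I mean:

```python
# Hypothetical pure-TF reproduction: run the same graph repeatedly and watch RSS.
import os

import numpy as np
import psutil
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(None, 400))
w = tf.Variable(tf.random_normal((400, 1024)))
y = tf.matmul(x, w)

data = np.random.rand(128, 400).astype(np.float32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(y, feed_dict={x: data})
        rss = psutil.Process(os.getpid()).memory_info().rss // 1024 ** 2
        print("rss={}MB".format(rss))
```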

@ilivans did anyone ever make a TF issue? I've noticed the same issue as outlined here, except with dramatic memory increases per predict call depending on what device I run the ops on in docker containers.

@btaba not me, I've successfully switched to PyTorch in the end =)

haha thanks @ilivans

> @ilivans did anyone ever make a TF issue? I've noticed the same issue as outlined here, except with dramatic memory increases per predict call depending on what device I run the ops on in docker containers.

Any clues?
I also get the memory leak in a Docker container but not when running on Windows (I haven't tried Linux without Docker yet).

I'm using TF 1.8 (CPU) with Keras 2.2.0 (also tried 2.1.6 with no luck).

I've filed a TensorFlow issue about this: https://github.com/tensorflow/tensorflow/issues/22098
