Ml-agents: Learn.py takes all GPU memory

Created on 11 Apr 2018 · 6 Comments · Source: Unity-Technologies/ml-agents

Hello,

When I run a training session using learn.py, the process allocates all of the GPU's memory.
Is there a way to avoid this and make it take only what it needs?

Thanks

help-wanted needs-info

All 6 comments

I don't think the problem is with TensorFlow or ml-agents (when I start training, it uses about 20 MB of VRAM).

You should check whether your game is doing what you think it does. If you have some kind of memory leak, remember that the game is played 100 times as fast as normal, so that might amplify the problem.

Hi @r-lipton, is this using one of our sample environments or your own? Generally, as @Hengoo and @MarcoMeter pointed out, we haven't noticed this on our environments.

I have seen this problem with OpenAI Baselines when invoking a second training run. Setting gpu_options.allow_growth = True fixed it for me.

In trainer_controller.py, replace the line that opens the TensorFlow session (line 212, the with tf.Session(...) as sess: statement) with:

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:

Update: I tested this today and was able to run multiple training runs concurrently on a single GPU
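
For anyone who wants to try the same setting outside of trainer_controller.py, here is a minimal sketch of the idea. The helper name make_growth_config is hypothetical and not part of ml-agents; the optional per_process_gpu_memory_fraction cap is a separate TensorFlow option that is not mentioned in this thread; and the tf.compat.v1 fallback is an assumption added so the snippet also loads on newer TensorFlow versions.

```python
def make_growth_config(memory_fraction=None):
    """Build a session config that lets GPU memory grow on demand.

    Hypothetical helper (not part of ml-agents). Assumes TensorFlow is installed.
    """
    import tensorflow as tf

    # Newer TensorFlow exposes the 1.x session API under tf.compat.v1;
    # older 1.x releases expose it directly on the tf module.
    tf1 = getattr(getattr(tf, "compat", None), "v1", tf)

    config = tf1.ConfigProto()
    # Allocate GPU memory as needed instead of reserving it all up front.
    config.gpu_options.allow_growth = True
    if memory_fraction is not None:
        # Optional hard cap on the share of GPU memory this process may use
        # (an extra setting, not part of the fix described in this thread).
        config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config
```

The returned object is passed to the session in the same way as in the patch above, i.e. tf.Session(config=make_growth_config()).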

It's using an environment I created myself.
@Sohojoe's solution worked for me, thanks!

Hi all. I've made a PR for this, and it will be added to the v0.5 release. https://github.com/Unity-Technologies/ml-agents/pull/1192
