Keras: Running Keras in a multi-node/multi-machine environment

Created on 26 Jul 2015 · 5 Comments · Source: keras-team/keras

Has anyone tried running Keras at a larger scale, such as on a cluster or in an HPC environment? I have access to one and would like to understand whether Keras can be used effectively there. Perhaps it would require storing intermediate outputs on disk, for example with joblib, to break up the processing?

stale

All 5 comments

What sort of parallelism are you looking to do? Data parallelism would be very easy to set up. Model parallelism would require some changes.
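
To make the data-parallel idea concrete, here is a rough sketch of synchronous data parallelism, simulated in a single process. It is not Keras code: a toy linear model stands in for the network, and the names `gradient`, `split`, and `train` are hypothetical helpers made up for illustration. It also assumes the data divides evenly among the workers. On a real cluster each shard's gradient would be computed on a different node and the averaging step would need some communication layer.

```python
# Synchronous data parallelism, simulated with plain Python lists.
# Every "worker" sees the same weights but only its own shard of the data;
# the per-shard gradients are averaged before one shared weight update.

def gradient(w, b, shard):
    """Mean-squared-error gradient of the model y = w*x + b over one shard."""
    n = len(shard)
    gw = sum(2.0 * (w * x + b - y) * x for x, y in shard) / n
    gb = sum(2.0 * (w * x + b - y) for x, y in shard) / n
    return gw, gb

def split(data, num_workers):
    """Split data into equal shards (assumes len(data) divides evenly)."""
    size = len(data) // num_workers
    return [data[i * size:(i + 1) * size] for i in range(num_workers)]

def train(data, num_workers, steps=3000, lr=0.01):
    w, b = 0.0, 0.0
    shards = split(data, num_workers)
    for _ in range(steps):
        # On a cluster, each of these gradient calls would run on its own node.
        grads = [gradient(w, b, shard) for shard in shards]
        gw = sum(g[0] for g in grads) / num_workers
        gb = sum(g[1] for g in grads) / num_workers
        w -= lr * gw
        b -= lr * gb
    return w, b

# Toy data drawn from y = 3x + 1, trained with four simulated workers.
data = [(float(x), 3.0 * x + 1.0) for x in range(8)]
w, b = train(data, num_workers=4)
```

With equal shard sizes, the averaged gradient matches the full-batch gradient, which is why this form of parallelism needs no change to the model itself. Model parallelism instead splits the model's layers or weights across machines, so the model code has to change.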

I ran a derivative of char-rnn successfully on our cluster via a PBS queue system, but only on a single machine with 16 CPU cores (we have no GPUs here yet). I would be interested to find out whether it is possible to train a model on several nodes at once, but it might be that Theano does not support this.

You can check out the Elephas project for Keras parallelization on Spark: https://github.com/maxpumperla/elephas

Thanks, I will check it out!

Dear sir, how can I train a model on several nodes? @tleeuwenburg
