Keras: run Keras on multiple CPUs in parallel?

Created on 29 Mar 2016 · 8 Comments · Source: keras-team/keras

Can I run Keras on a cluster of CPUs in parallel? I currently have no GPU resources, but I do have a CPU cluster, so I am wondering whether I can adapt my Keras program to run across many CPUs.

Also, do you have any idea how the speed compares between a GPU (say, an Amazon EC2 g2.2xlarge instance) and a CPU cluster (say, 10 CPUs)?

I have also seen the tool elephas. Its documentation has a section on the usage of data-parallel models. What does _data parallel_ mean here?

Does it mean that during each epoch, the batches are split up and trained in parallel on different workers?

fluency03

👍1

Most helpful comment

Running Keras on a cluster of CPUs or a distributed platform seems like a big project.
Even then, the speed-up obtained from a CPU cluster will not be as good as that of an ordinary GPU, because of the dense matrix computations involved.

fluency03 on 30 Mar 2016

👍2

All 8 comments

> And do you have some ideas about the speed between GPU (let's say Amazon EC2 g2.2xlarge instance) and a CPU cluster (let's say 10 cpus).

This: https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py

Intel i7, 6x 4 GHz: 257.03 samples/s, ETA: 00:58:21
GPU Nvidia 680: 2,094.22 samples/s, ETA: 00:06:29

Just to give you some rough numbers.

marcj on 29 Mar 2016

👍2

@marcj, the comparison I would actually like to see is GPU vs. CPU cluster.

It seems like you are using a single CPU.

fluency03 on 29 Mar 2016

It uses all 6 cores. So what you see is that a single very old GPU (2012) is usually way faster than a CPU cluster (at least one with realistic CPU core counts, like 10-32).

marcj on 29 Mar 2016

👍2

The CPU cluster I mean here is, for example, 10 CPUs, each of which is an Intel i7.

fluency03 on 29 Mar 2016

I see. In that case you can estimate a rough ETA: 60 min on 1 CPU (with 6 real cores) scales to 6 min on 10 CPUs. In practice it will be slower, since multithreading comes with a cost, so even under perfect conditions your 10-CPU cluster runs only about as fast as a single 4-year-old GPU. The only way to get better numbers is to run something like mnist_cnn.py on your CPU cluster and on a GPU and compare. :P I guess this should be enough to see that the GPU is way faster. Btw, multithreading in Keras is usually done via OpenMP in Theano: http://deeplearning.net/software/theano/tutorial/multi_cores.html
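
For anyone who wants to redo this back-of-the-envelope estimate, here is a minimal Python sketch. It only uses the throughput and ETA figures quoted earlier in this thread; the helper names (`to_seconds`, `ideal_cluster_eta`) are made up for illustration, and perfectly linear scaling across CPUs is an optimistic assumption, not a measurement.

```python
# Figures quoted earlier in the thread for examples/mnist_cnn.py.
cpu_rate = 257.03    # samples/s on the 6-core Intel i7
gpu_rate = 2094.22   # samples/s on the Nvidia 680


def to_seconds(eta):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in eta.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def ideal_cluster_eta(single_cpu_eta, n_cpus):
    """Best-case ETA in seconds, ASSUMING perfectly linear scaling.

    Real clusters pay for communication and synchronization,
    so the actual ETA would be higher than this.
    """
    return single_cpu_eta / n_cpus


cpu_eta = to_seconds("00:58:21")
gpu_eta = to_seconds("00:06:29")

gpu_speedup = gpu_rate / cpu_rate
cluster_eta = ideal_cluster_eta(cpu_eta, 10)

print(f"GPU vs. single CPU throughput: {gpu_speedup:.1f}x")
print(f"Ideal 10-CPU ETA: {cluster_eta / 60:.1f} min")
print(f"Measured GPU ETA: {gpu_eta / 60:.1f} min")
```

So the best case for 10 such CPUs lands in the same range as the single GPU, before any parallelization overhead is counted.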

marcj on 29 Mar 2016

That may not be part of Keras itself; I think you should look at how TensorFlow/Theano supports it.
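
As a rough sketch of what that looks like on the Theano side: the multi-core tutorial linked above describes enabling OpenMP through the `openmp` config flag and controlling the thread count with the standard `OMP_NUM_THREADS` environment variable. The thread count of 6 below is just an assumed value matching the 6-core i7 mentioned in this thread. Note that OpenMP only uses the cores of one machine; it does not spread work across a cluster.

```python
import os

# Both settings need to be in place BEFORE Theano is imported,
# because Theano reads its configuration at import time.
os.environ["OMP_NUM_THREADS"] = "6"          # assumed: one thread per physical core
os.environ["THEANO_FLAGS"] = "openmp=True"   # note: overwrites any existing THEANO_FLAGS

# import theano  # uncomment on a machine that has Theano installed
```

Whether this actually helps depends on the model; it is worth timing a run with and without it.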