Can I run Keras in parallel on a cluster of CPUs? I currently have no GPU resources, but I do have a CPU cluster, so I am wondering whether I can run my Keras program across many CPUs.
Also, do you have any idea how the speed of a GPU (say, an Amazon EC2 g2.2xlarge instance) compares with that of a CPU cluster (say, 10 CPUs)?
I have also seen the tool elephas, whose documentation has a section on the usage of data-parallel models. What does _data parallel_ mean here?
Does it mean that during each epoch the batches are distributed across different workers and trained in parallel?
Running this example: https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py
gives:
Intel i7, 6x 4 GHz: 257.03 samples/s, ETA: 00:58:21
GPU Nvidia 680: 2,094.22 samples/s, ETA: 00:06:29
Just to give you some rough numbers.
@marcj, the comparison I would actually like to see is GPU vs. CPU cluster.
It seems like you are using a single CPU.
It uses all 6 cores. So what you see is that a single, very old GPU (2012) is usually way faster than a CPU cluster (at least one with realistic CPU core counts, like 10-32).
The CPU-cluster I mean here is, for example, 10 CPUs where each of them is an Intel i7.
I see. Well, then you can calculate a rough ETA: 60 min on 1 CPU (with 6 real cores) == 6 min on 10 CPUs. In practice it will be slower, since parallelization comes with a cost, so even under perfect conditions your 10-CPU cluster only runs about as fast as a single 4-year-old GPU. The only way to get better numbers is to run, for example, mnist_cnn.py on both your CPU cluster and a GPU and compare. :P I guess this should be enough to see that the GPU is way faster. Btw, multithreading in Keras is usually done through OpenMP in Theano: http://deeplearning.net/software/theano/tutorial/multi_cores.html
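The back-of-the-envelope estimate above can be written out as a few lines of Python. This is only a sketch: the helper name `ideal_cluster_eta` and its `efficiency` parameter are made up for illustration, and the 80% efficiency figure is an arbitrary assumption, not a measurement. The two ETAs are the ones quoted earlier in the thread.

```python
def ideal_cluster_eta(single_node_eta_s, n_nodes, efficiency=1.0):
    """Estimated ETA in seconds on n_nodes identical machines.

    efficiency=1.0 means perfect linear scaling (hypothetical best case);
    lower values model communication and synchronization overhead.
    """
    return single_node_eta_s / (n_nodes * efficiency)


# ETAs reported above for mnist_cnn.py
cpu_eta_s = 58 * 60 + 21  # Intel i7, 6 cores: 00:58:21
gpu_eta_s = 6 * 60 + 29   # Nvidia 680:        00:06:29

# Best case for 10 such CPUs, and an assumed 80% scaling efficiency
best_case_s = ideal_cluster_eta(cpu_eta_s, n_nodes=10)
with_overhead_s = ideal_cluster_eta(cpu_eta_s, n_nodes=10, efficiency=0.8)

print(f"single GPU:           {gpu_eta_s / 60:.1f} min")
print(f"10 CPUs, ideal:       {best_case_s / 60:.1f} min")
print(f"10 CPUs, 80% scaling: {with_overhead_s / 60:.1f} min")
```

With perfect scaling the 10-CPU estimate lands slightly below the single-GPU time; with the assumed 80% efficiency it is already above it.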
That may not be part of Keras itself; I think you should look at how TensorFlow/Theano supports it.
Running Keras on a cluster of CPUs or a distributed platform seems like a big project.
Even then, the speed-up obtained from a CPU cluster will not be as good as what an ordinary GPU gives, because of the dense matrix calculations.
Have you already tried elephas?
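On the question of what "data parallel" means: in general, the same model is copied to every worker, each worker computes gradients on its own slice of the data, and the results are combined into one update. Below is a minimal NumPy sketch of one synchronous variant of that idea on a toy linear model. It is not how elephas is implemented (check its docs for the update schemes it actually uses), the function names are made up for illustration, and the "workers" here are just a loop in one process.

```python
import numpy as np


def gradient(w, X, y):
    """Gradient of the mean squared error for a linear model y ~ X @ w."""
    residual = X @ w - y
    return 2.0 * X.T @ residual / len(y)


def data_parallel_step(w, X, y, n_workers, lr=0.1):
    """One synchronous data-parallel SGD step (workers simulated in a loop)."""
    # Each worker gets its own shard of the batch
    X_shards = np.array_split(X, n_workers)
    y_shards = np.array_split(y, n_workers)
    # Every worker computes a gradient with the same weights w
    grads = [gradient(w, Xs, ys) for Xs, ys in zip(X_shards, y_shards)]
    # Combine: average weighted by shard size, which equals the full-batch gradient
    shares = [len(ys) / len(y) for ys in y_shards]
    avg_grad = sum(s * g for s, g in zip(shares, grads))
    return w - lr * avg_grad


rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)
for _ in range(200):
    w = data_parallel_step(w, X, y, n_workers=4)
print(w)
```

On a real cluster the gradient exchange goes over the network on every step, which is one reason the speed-up is less than linear in the number of workers.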