In my research, I use two types of AWS EC2 instances to train my models: p2.xlarge (1 GPU) and p2.8xlarge (8 GPUs). I noticed that using multi_gpu_model() on the p2.8xlarge instances actually increases training time per epoch by roughly 50% compared to the p2.xlarge instances.
The environment I use is the Deep Learning Amazon Linux AMI. Specs can be found here: https://aws.amazon.com/marketplace/pp/B077GF11NF?qid=1516817149793&sr=0-10&ref_=srh_res_product_title
My code for the multi-GPU instantiation is as follows:
with tf.device('/cpu:0'):
    base_model = ...  # build the model on the CPU so its weights live in host memory

model = multi_gpu_model(base_model, gpus=8)
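For reference, here is a fuller sketch of the pattern I am following, along the lines of the Keras docs example for multi_gpu_model. The Xception model, random data, and batch size below are placeholders rather than my actual setup:

import numpy as np
import tensorflow as tf
from keras.applications import Xception
from keras.utils import multi_gpu_model

# Instantiate the base model on the CPU; the replicas are placed on the GPUs.
with tf.device('/cpu:0'):
    base_model = Xception(weights=None, input_shape=(299, 299, 3), classes=10)

# Replicate the model across 8 GPUs; gradients are merged on the CPU.
parallel_model = multi_gpu_model(base_model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# Placeholder random data: batch_size=256 means 32 samples per GPU per step.
x = np.random.random((1024, 299, 299, 3))
y = np.random.random((1024, 10))
parallel_model.fit(x, y, epochs=1, batch_size=256)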
Has anyone encountered this issue before? Is it specific to how the AMI was set up?
Most helpful comment
You can find more details in #9204 and #9502.