Models: batch norm of Inception v3 on multi-GPU

Created on 16 May 2016 · 3 comments · Source: tensorflow/models

Does every tower batch-norm its own batch (its slice of the overall batch in multi-GPU mode), or are the Wx+b outputs of all towers concatenated so that batch norm is computed over batch_size × num_GPUs examples?
The latter could be much slower due to the synchronization.

Most helpful comment

Does it mean the moving_mean and moving_variance on each tower will potentially be updated to different values even when the variables are shared across towers?
When we save the model, which tower's moving_mean/variance is saved?
Is there a way to handle this correctly?

All 3 comments

Each tower performs batch_norm on its own part of the batch; there is no synchronization across towers for that.
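A minimal sketch of that per-tower behaviour, using the TF 1.x `tf.layers.batch_normalization` API (the tower and scope names here are illustrative, not taken from the Inception code):

```python
import tensorflow as tf  # TF 1.x API, matching the era of this thread

images = tf.placeholder(tf.float32, [None, 299, 299, 3])
num_gpus = 2
shards = tf.split(images, num_gpus, axis=0)  # each tower gets one slice

outputs = []
for i, shard in enumerate(shards):
    with tf.device('/gpu:%d' % i), tf.name_scope('tower_%d' % i):
        # reuse=True after the first tower, so gamma/beta and the moving
        # averages are shared variables rather than per-tower copies.
        with tf.variable_scope('model', reuse=(i > 0)):
            # Batch statistics (mean, variance) are computed from this
            # shard alone -- there is no synchronization across towers.
            outputs.append(tf.layers.batch_normalization(shard, training=True))
```

With a global batch of size B, each tower therefore normalizes with statistics from B / num_gpus examples, which is exactly how the cross-device synchronization the question worries about is avoided.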

@sun9700: Please reopen if that doesn't answer your question.

Does it mean the moving_mean and moving_variance on each tower will potentially be updated to different values even when the variables are shared across towers?
When we save the model, which tower's moving_mean/variance is saved?
Is there a way to handle this correctly?
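Since the variables are shared, the checkpoint holds only one moving_mean/moving_variance per layer: whatever value the most recent update op left there, not a particular tower's copy. Running the update ops from every tower would move the shared averages several times per step, each time toward a different shard's statistics. A common remedy, which the multi-GPU Inception training script here also takes, is to retain the batch-norm update ops from a single tower only. A minimal sketch continuing the layout above (scope names are again illustrative):

```python
import tensorflow as tf  # TF 1.x API, as in the snippet above

x = tf.placeholder(tf.float32, [None, 8])
loss = 0.0
for i, shard in enumerate(tf.split(x, 2, axis=0)):
    with tf.name_scope('tower_%d' % i):
        with tf.variable_scope('model', reuse=(i > 0)):
            y = tf.layers.batch_normalization(shard, training=True)
            loss += tf.reduce_mean(tf.square(y))

# Each tower registered its own moving-average update ops under its
# name scope; keeping only tower_0's avoids redundant, conflicting
# updates to the shared moving_mean/moving_variance.
bn_updates = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='tower_0')
optimizer = tf.train.GradientDescentOptimizer(0.1)
with tf.control_dependencies(bn_updates):
    train_op = optimizer.minimize(loss)
```

The moving averages then advance once per step, from one shard's statistics; since each shard is an i.i.d. sample of the batch, this loses little compared with averaging over all towers.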

