I am also having performance issues with the BatchNormalization layer.
I created a small script that reproduces my issue (output included) here: https://gist.github.com/ma1112/8c118d8584da9eb5637053193790bb47
In short, my (regression) network is [Conv, BatchNorm, Activation, MaxPool] x 6, Flatten, Dense, BatchNorm, Activation, Dense
Omitting the BatchNorm layers after the Conv layers results in 7x faster training [148 sec instead of 1102 sec per epoch] and 52x faster prediction [2.2 sec instead of 120 sec] on a Tesla K80.
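For reference, here is a minimal sketch of the architecture described above so others can try to reproduce the slowdown. The filter counts, kernel sizes, input shape, and optimizer are my own assumptions, not taken from the linked gist; toggle `use_bn` to compare timings with and without the BatchNorm layers after each Conv.

```python
from tensorflow.keras import layers, models

def build_model(input_shape=(64, 64, 1), use_bn=True):
    """[Conv, (BatchNorm), Activation, MaxPool] x 6, Flatten, Dense, BatchNorm, Activation, Dense."""
    model = models.Sequential()
    for i in range(6):
        if i == 0:
            model.add(layers.Conv2D(32, (3, 3), padding='same',
                                    input_shape=input_shape))
        else:
            model.add(layers.Conv2D(32, (3, 3), padding='same'))
        if use_bn:
            model.add(layers.BatchNormalization())
        model.add(layers.Activation('relu'))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(64))
    model.add(layers.BatchNormalization())
    model.add(layers.Activation('relu'))
    model.add(layers.Dense(1))  # single regression output
    model.compile(optimizer='adam', loss='mse')
    return model
```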
Me too. Finally someone is talking about it. When I add BN layers, performance drops by about 2x. Networks with BN layers are too slow; maybe you should do something to make it better.
Maybe this is the reason for the slowdown.
After reading through the comments from the last week, I finally figured out what the problem was with my code above.
It seems that TensorFlow really needs the data_format to be set to channels_last. In my example code above, I used the channels_first setting. After switching to channels_last [and modifying the shape of the input array and the axis of the BN layer], the training time decreased significantly [28 sec from 157 sec per epoch] even without the BN layers. Moreover, adding the BN layers now has minimal effect on the training time [37 sec instead of 28, with 6 added BN layers].
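To make the change concrete, here is a hedged sketch of the two configurations (illustrative values only, not the gist's actual code). With channels_first the channel dimension is axis 1, so BatchNormalization needs axis=1; with channels_last the channel dimension is last, which is BatchNormalization's default axis=-1, and the input array has to be reshaped to match.

```python
from tensorflow.keras import layers

# channels_first: inputs shaped (batch, channels, height, width)
conv_cf = layers.Conv2D(32, (3, 3), padding='same', data_format='channels_first')
bn_cf = layers.BatchNormalization(axis=1)   # channel axis is 1

# channels_last: inputs shaped (batch, height, width, channels)
conv_cl = layers.Conv2D(32, (3, 3), padding='same', data_format='channels_last')
bn_cl = layers.BatchNormalization(axis=-1)  # the default channel axis
```

In my case, only the channels_last variant avoided the slowdown reported above.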
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.