Keras: Why does Keras use tf.Variable instead of tf.get_variable in tensorflow_backend's K.variable?

Created on 26 Sep 2017  ·  9 comments  ·  Source: keras-team/keras

Hi everyone,
As far as I know, tf.Variable can't create shared variables, which are useful in many tasks such as multi-GPU processing.
Is there anything I missed indicating that tf.Variable is more suitable than tf.get_variable in tensorflow_backend?
Thanks a lot!

All 9 comments

@dongfangyixi if you can create a PR correcting this and specifically articulate exactly why the new design is better, I expect the author would accept the improvement.

get_variable and variable scopes are a user-facing TF API designed to allow for variable sharing via name-referencing in a global variable namespace.

In Keras, variable sharing is handled differently: by having layers be objects that hold their own variables. Calling a layer multiple times reuses its variables.

In the Keras backend, there is no reason to use this user-facing API since it doesn't solve any problem we have and it adds a lot of hidden complexity over a plain variable instantiation. And importantly, using get_variable in place of Variable would break a lot of things (naming, etc) because Keras does not use variable scopes anywhere.
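
For reference, here is a minimal sketch of the layer-object sharing described above, using the standard Keras functional API (the layer and tensor names are just illustrative):

```python
# Keras-style weight sharing: one layer object, called on two inputs,
# reuses its own variables; no variable scopes involved.
from keras.layers import Input, Dense
from keras.models import Model

shared_dense = Dense(64, activation='relu')  # this object owns its weights

input_a = Input(shape=(32,))
input_b = Input(shape=(32,))

# Both calls go through the same layer instance, so they share one kernel and one bias.
out_a = shared_dense(input_a)
out_b = shared_dense(input_b)

model = Model([input_a, input_b], [out_a, out_b])
assert len(shared_dense.weights) == 2  # one kernel + one bias, shared by both calls
```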

@fchollet that approach does appear to have some negative consequences according to @ppwwyyxx, quoted below from https://github.com/ppwwyyxx/tensorpack/issues/160#issuecomment-335960211:

The fundamental reason is that calling tf.Variable will ignore variable scope reuse, which is the fundamental mechanism of multi-GPU training. I'm not sure whether this is necessary for Keras or whether it is possible to change. There is actually a question on Keras about this: fchollet/keras#7992

The current way to build a Keras model inside tensorpack isn't too bad IMHO. Apart from that, maybe more things can be done on the tensorpack side, e.g. integrating with Keras loss/regularization automatically.
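
For context, a rough TF 1.x sketch of the scope-reuse behaviour the quote refers to: tf.get_variable honours variable_scope(reuse=True), while tf.Variable always creates a fresh variable (names below are illustrative):

```python
import tensorflow as tf

with tf.variable_scope('tower'):
    w1 = tf.get_variable('w', shape=[3, 3])
    v1 = tf.Variable(tf.zeros([3, 3]), name='v')

with tf.variable_scope('tower', reuse=True):
    w2 = tf.get_variable('w', shape=[3, 3])       # returns the same variable as w1
    v2 = tf.Variable(tf.zeros([3, 3]), name='v')  # a brand-new variable; reuse is ignored

print(w1 is w2)            # True
print(v1.name, v2.name)    # different variables, e.g. tower/v:0 vs tower_1/v:0
```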

Keras supports TF device scopes and multi-GPU training works just fine...

@ppwwyyxx could you comment? I'm not familiar enough with the details to comment accurately on this issue.

@ahundt It's more of a design choice in my opinion. Multi-GPU still works in my example. It just has to work around some issues (caching the Keras model) due to the different use of variable scopes.

Thanks! I guess based on that this should remain closed.

I am making a GAN in TensorFlow but I am using Keras LSTM layers. How would I go about variable sharing in this case (since I am using variable scopes)? What about training the generator and discriminator separately (which I also use variable scopes for)?
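
Not an authoritative answer, but applying the layer-object approach from the comment above, sharing comes from calling the same Keras objects rather than from variable scopes; a sketch with invented names:

```python
# Sketch: share Keras LSTM layers inside a TF 1.x graph by reusing the layer
# objects, and train the discriminator separately via an explicit var_list
# instead of filtering variables by scope name. All names are illustrative.
import tensorflow as tf
from keras.layers import LSTM, Dense

disc_lstm = LSTM(128)                         # created once; owns its weights
disc_dense = Dense(1, activation='sigmoid')

def discriminator(seq):
    return disc_dense(disc_lstm(seq))

real_seq = tf.placeholder(tf.float32, [None, 20, 10])
fake_seq = tf.placeholder(tf.float32, [None, 20, 10])  # stand-in for the generator output

d_real = discriminator(real_seq)  # first call creates the variables
d_fake = discriminator(fake_seq)  # second call reuses the same variables

# Collect the discriminator's variables from the layer objects, not from a scope.
d_vars = disc_lstm.trainable_weights + disc_dense.trainable_weights
d_loss = -tf.reduce_mean(tf.log(d_real + 1e-8) + tf.log(1.0 - d_fake + 1e-8))
d_step = tf.train.AdamOptimizer(1e-4).minimize(d_loss, var_list=d_vars)
```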

@dongfangyixi I am using tf.Variable with Keras, but how could I initialize it with Xavier? With tf.get_variable I was able to initialize it that way. So how can I initialize a variable using the Xavier initializer?
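
One possible approach (a sketch, not from this thread): evaluate a Xavier/Glorot initializer for the shape you need and pass the resulting tensor to tf.Variable, which avoids tf.get_variable entirely. TF 1.x style, names illustrative:

```python
import tensorflow as tf

shape = [784, 256]
xavier = tf.contrib.layers.xavier_initializer()  # Glorot-uniform by default
w = tf.Variable(xavier(shape), name='w')         # plain Variable with Xavier-initialised values

# Inside Keras layers the same effect is usually requested declaratively,
# e.g. Dense(256, kernel_initializer='glorot_uniform').
```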
