Keras 2 requests for contribution

Created on 11 Feb 2017 · 29 Comments · Source: keras-team/keras

There is an early draft branch available for Keras 2: https://github.com/fchollet/keras/tree/keras-2
Note that the codebase will keep evolving a lot. Support for backwards compatibility with Keras 1 will only be added once the new codebase is finalized.

In this thread I will be posting very specific requests for contribution.

[DONE] Fix Theano Conv2DTranspose layer

The Deconvolution layer has become Conv2DTranspose, with a simplified API that no longer requires users to specify an output shape (that was a big issue with the previous API). The new implementation works fine with TensorFlow, but appears to be broken with Theano (most likely in the backend function conv2d_transpose). I haven't had time to look into it; feel free to check it out and fix it.
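
For illustration, a minimal sketch of the simplified API (Keras 2 signatures as on the keras-2 branch; the shapes and channels-last ordering here are just an example):

from keras.layers import Input, Conv2DTranspose

x = Input(shape=(7, 7, 64))
# Keras 1 required an explicit output_shape, e.g.
# Deconvolution2D(32, 3, 3, output_shape=(None, 14, 14, 32), subsample=(2, 2))
# Keras 2 infers the output shape from strides and padding:
y = Conv2DTranspose(filters=32, kernel_size=(3, 3), strides=(2, 2), padding='same')(x)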

[MOSTLY DONE] Unit tests

The API is changing and the previous unit tests all need to be ported to the new API. I haven't started to look into this. Feel free to pick any unit test file and update it.

Update examples to the new API

Likewise.


If you have feedback, bugs to report, or if you're opening PRs against the keras-2 branch, post in this thread.

Label: contributions welcome


All 29 comments

For the record, I estimate progress to be about 50%, not counting updating the docs, examples and unit tests.

Git diff with master: 85 files changed, 5060 insertions(+), 5332 deletions(-)

Hi @fchollet, going through some of your updates now. I wanted to verify a couple things:

  1. You seem to be renaming nb_X (e.g. nb_classes) to num_X. Should this be done consistently across the codebase?
  2. objectives -> losses
  3. TimeDistributedDense is gone, so replace with TimeDistributed(Dense(...)) where appropriate?

@braingineer yes, that's all correct.

Note that TimeDistributedDense had been deprecated for ~6 months already.

Regarding nb_*, there may be a few exceptions where we rename things to simply *s. E.g. nb_epoch would become epochs. But, yes, generally nb_* becomes num_*.
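
A quick before/after illustration of these renames (a sketch only; the trivial model here is just for demonstration):

import numpy as np
from keras.models import Sequential
from keras.layers import TimeDistributed, Dense

model = Sequential()
# Keras 1: model.add(TimeDistributedDense(8, input_shape=(5, 4)))
model.add(TimeDistributed(Dense(8), input_shape=(5, 4)))
# Keras 1: losses were imported from keras.objectives
model.compile(optimizer='sgd', loss='mse')
X = np.random.random((2, 5, 4))
y = np.random.random((2, 5, 8))
# Keras 1: model.fit(X, y, nb_epoch=1)
model.fit(X, y, epochs=1)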

Thanks!

Note that TimeDistributedDense had been deprecated for ~6 months already

Ah. I personally deviated from the main branch on wrappers a while ago, so I never picked up on this. I had just noticed that it was still being used in a bunch of the tests.

I've run out of cycles to push on this until Sunday evening/Monday. If someone else wants to push on it, I can start a PR and they can pick up where I left off.

@fchollet: There's a small fix for the Conv2DTranspose layer with Theano in #5382.

At present, there's a zoo of recurrent layers (SimpleRNN, LSTM, GRU, ConvLSTM). By exposing the recurrent API and making use of the existing graph-building machinery, one could construct arbitrary recurrent structures without resorting to backend functions. In my mind, this is what the API would look like:

from keras.layers import Recurrent, Input, InputState

def LSTM(num_inputs, num_cells):
    """Handmade LSTM cell."""
    # x_t.shape = (batch_size, num_inputs)
    x_t = Input(shape=(num_inputs,))
    C_t_minus_1 = InputState(shape=(num_cells,))
    h_t_minus_1 = InputState(shape=(num_cells,))

    # LSTM Equations
    C_t = ...
    h_t = ...

    return Recurrent(inputs=x_t, state_updates={C_t_minus_1: C_t, h_t_minus_1: h_t})

# x.shape = (batch_size, num_time_steps, num_inputs)
x = Input(shape=(10, 42))
y = LSTM(num_inputs=42, num_cells=100)(x)

Of course, we'll need Multiply and Add layers to afford full flexibility, but that comes for free given Lambda. I can contribute if required.
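
For instance, elementwise Add and Multiply fall out of Lambda like this (a sketch; the layer names here are just illustrative):

from keras.layers import Input, Lambda

a = Input(shape=(100,))
b = Input(shape=(100,))
# Calling a Lambda layer on a list of tensors passes the whole list to the function:
add = Lambda(lambda t: t[0] + t[1])
multiply = Lambda(lambda t: t[0] * t[1])
summed = add([a, b])
gated = multiply([a, b])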

New requests for contribution on Keras 2:

  • [FIXED] Fixing ConvLSTM2D. Currently broken, haven't looked into it.
  • Making it possible to set the initial states of recurrent layers, either by value (from a numpy array) or symbolically (from tensors that come from somewhere else in the graph), and writing an example script for seq2seq. See the sketch after this list.
  • Making sure all layers that can support masking do so (e.g. TimeDistributed should).
  • Making sure losses and metrics get properly masked in training.py.

Also:

  • Fix issue with shape inference in SeparableConv2D layer when using kernel dilation.
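
One hypothetical shape for the symbolic initial-state case (a sketch only; initial_state as a call argument is one possible design, and the names here are illustrative):

from keras.layers import Input, Dense, LSTM

x = Input(shape=(10, 32))
encoded = Input(shape=(64,))
h0 = Dense(64)(encoded)  # initial hidden state from elsewhere in the graph
c0 = Dense(64)(encoded)  # initial cell state from elsewhere in the graph
y = LSTM(64)(x, initial_state=[h0, c0])  # hypothetical call signature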

Keras 2 is getting there, with nearly all unit tests passing and few bugs left.

One thing that is easy to do, would be very helpful, and needs to be done quickly, is to convert the example scripts in examples/ to the new API. I have done a few, but about two thirds are left.
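
The conversions are mostly mechanical. A typical before/after, assuming the common keras-2 renames (sketch):

from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

model = Sequential()
# Keras 1: Convolution2D(32, 3, 3, border_mode='same', subsample=(2, 2), input_shape=(28, 28, 1))
model.add(Conv2D(32, (3, 3), padding='same', strides=(2, 2), input_shape=(28, 28, 1)))
model.add(Flatten())
# Keras 1: Dense(output_dim=10, init='glorot_uniform')
model.add(Dense(10, kernel_initializer='glorot_uniform'))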

git diff master --stat:
178 files changed, 13760 insertions(+), 13242 deletions(-)

Reminder: you too can help, and it's easy. See above.

So, I hope to do at least a few of the example conversions. Are you also thinking that this should be the time we double-check the validity of the docstrings and everything? It seems like we should:

a) make sure the examples work as described in 1.2.2
b) migrate to K2 and again make sure that it still works

...which may be pretty time-consuming (validating before/after, on both backends, tracking down any issues).

Are you also thinking that this should be the time we double-check the validity of the doc strings and everything?

Yes, that would be quite helpful.

Any time constraint? You mentioned you'd like this aspect done quickly.

There are no time constraints. But do keep in mind that if something is pending for a while, someone else (or myself) may push it before you do.

Sure - just curious whether you were pushing for a March 1st release date or something similar.

We will ship it when it's ready. March 1st may be a bit early. But probably the first week of March.

Would there be interest in multiple dataset backends so functions like Model.fit() can be extended?

For example, formats like TFRecord could be supported (among others) in addition to HDF5. Here is a starter TFRecord pull request + discussion in keras-contrib: https://github.com/farizrahman4u/keras-contrib/pull/27

Also on that topic, https://github.com/fchollet/keras/pull/5445 explains the new pull request procedures for Keras 1, Keras 2, and keras-contrib as I understand them.

Would there be interest in multiple dataset backends so functions like Model.fit() can be extended?

Yes, that would be useful. TFRecord in particular is relevant. But that can be added later.

Another important thing to do: fix Theano-related errors. Tests are all passing with TensorFlow, but some tests are failing with Theano. I have no time to look into it right now.

@fchollet

Making it possible to set the initial states of recurrent layers, either by value (from numpy array) or symbolically (from tensors that come from somewhere else in the graph)

This and much more has already been done in recurrentshop. Maybe we can integrate the whole thing into Keras? (including timestep-wise RNN building, depth-first computation, readout, etc.)

@farizrahman4u in general I am interested in integrating some of the features of recurrentshop into Keras. However this won't happen in Keras 2. But one feature that we need right away is just the ability to set initial states of RNNs.

Would there be interest in multiple dataset backends so functions like Model.fit() can be extended?

Yes, that would be useful. TFRecord in particular is relevant. But that can be added later.

Ideally, the x, y parameters of model.fit() (and other parameters that currently take numpy arrays) would support tensors from whatever API/backend, for performance reasons. For example, in addition to TFRecords, TF tensor ops can be implemented to load data directly from imaging hardware or graphical simulations for RL.

Might tensor parameter support be worth considering in the 2.0 spec to avoid breaking changes later, even if it is merely a no-op to be implemented after 2.0 is released?

@fchollet
To allow numpy array initial states, you can simply add an argument to the RNN constructor; it should then be properly serialized in the layer config.
To allow initializing states symbolically, you might have to allow dict input to layers (this is how it is done in recurrentshop). You could then call an RNN like this: y = LSTM(5)({'input': x, 'h': tensor1, 'c': tensor2}).

Fix issue with shape inference in SeparableConv2D layer when using kernel dilation.

Is there an example? @fchollet

@fchollet How about supporting more TensorBoard functionality? All sorts of summaries, and an improved graph?

For example, I can add a name field to Keras layers, so that graphs generated by the TensorBoard callback are much cleaner, with the name of each layer block.
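
Something along these lines (a sketch; the name argument and the TensorBoard callback shown here already exist, the proposal is about surfacing the names in the generated graph):

from keras.layers import Input, Dense
from keras.models import Model
from keras.callbacks import TensorBoard

x = Input(shape=(784,), name='pixels')
h = Dense(128, activation='relu', name='encoder')(x)
y = Dense(10, activation='softmax', name='classifier')(h)
model = Model(inputs=x, outputs=y)
# Passing callbacks=[TensorBoard(log_dir='/tmp/logs')] to fit() would then
# produce a graph built from these named blocks.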

It is not possible to fit a Keras model on TF tensors, because that would involve modifying the existing computation graph, which is impossible, or creating a new one altogether, which has undesirable side effects. What you can do, however, is build your model on top of inputs that are TF tensors. You can still use fit in that context (instead of expecting data to be provided as Numpy arrays, the model will pull data from the relevant graph nodes automatically).

However... there is little rationale for using Keras fit at this point. If your workflow is TF-based, you should train your Keras models using TF training loops (e.g. TF Experiment which will be compatible with Keras models). fit will always primarily be for arrays and Python generators.
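
A minimal sketch of that pattern (assuming Input(tensor=...) as on the keras-2 branch; the placeholder stands in for any graph node, e.g. a TFRecord reader):

import tensorflow as tf
from keras.layers import Input, Dense
from keras.models import Model

tf_x = tf.placeholder(tf.float32, shape=(None, 784))  # any TF tensor would do here
x = Input(tensor=tf_x)
y = Dense(10, activation='softmax')(x)
model = Model(inputs=x, outputs=y)
# fit() will now pull input data from the graph rather than from numpy arrays.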

If your workflow is TF-based, you should train your Keras models using TF training loops (e.g. TF Experiment which will be compatible with Keras models).

I've been running Keras with the TF backend, and I just happen to have a TFRecord dataset. I was just giving examples of other data sources for which converting to numpy arrays is expensive. If fit() will always take numpy arrays, I'll follow your advice and use TF directly, or convert the TFRecords to the accepted format. I did like the idea of switchable dataset backends in addition to model backends. Oh well, thanks for the advice!

Another important thing to do: fix Theano-related errors. Tests are all passing with TensorFlow, but some tests are failing with Theano. I have no time to look into it right now.

@fchollet I am working on the Theano-related errors.

It seems this issue has been resolved. We can open another one for clarity if the need arises to ask for more help from the contributors, detailing exactly what is left to do.
