Deeplearning4j: DL4J KerasSequentialImport problem

Created on 12 Feb 2020 · 5 Comments · Source: eclipse/deeplearning4j

Issue Description

Simple LSTM models trained with Keras give wrong predictions after being imported into Java with the Keras Sequential model import.
The model starts with a Masking layer, followed by an LSTM layer and a Dense layer.
When the model is small and untrained, the imported Java version gives the same predictions as the Python version.
But whenever I provide a trained model or a relatively larger model, I get a complete mismatch.

To make sure the Python and Java models take the same input, I preprocessed a sample input in Python and saved it to an npy file after transposing the second and third dimensions. Then, in Java, I used Nd4j.createFromNpyFile to create the same input and imported the h5 file saved from Python.
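The transpose matters because Keras recurrent layers take input as [batch, timesteps, features], while DL4J recurrent layers expect [batch, features, timesteps]. A minimal sketch of that axis swap in plain Java follows. The class and method names are hypothetical and not part of DL4J or Keras, and the input is assumed to be a non-empty rectangular array.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of DL4J or Keras.
final class RnnLayoutSketch {

    // Converts [batch][timesteps][features] (Keras layout) into
    // [batch][features][timesteps] (DL4J recurrent layout).
    // Assumes a non-empty, rectangular input array.
    static double[][][] swapTimeAndFeatureAxes(double[][][] kerasLayout) {
        int batch = kerasLayout.length;
        int steps = kerasLayout[0].length;
        int features = kerasLayout[0][0].length;
        double[][][] dl4jLayout = new double[batch][features][steps];
        for (int b = 0; b < batch; b++) {
            for (int t = 0; t < steps; t++) {
                for (int f = 0; f < features; f++) {
                    dl4jLayout[b][f][t] = kerasLayout[b][t][f];
                }
            }
        }
        return dl4jLayout;
    }

    public static void main(String[] args) {
        // One sequence with 2 time steps and 3 features per step.
        double[][][] x = {{{1, 2, 3}, {4, 5, 6}}};
        System.out.println(Arrays.deepToString(swapTimeAndFeatureAxes(x)));
    }
}
```

Doing the swap once on the Python side before saving, as described above, has the same effect and keeps the Java side limited to loading the file.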

So both Python and Java load the model from the same h5 file and use the same input npy file, except that the second and third dimensions are transposed. They disagree completely on the result.
Only when the model is extremely small or untrained, they do match.

I've put two model files on GitHub: model_latest.h5 is the one that doesn't match, and model_toy.h5 is the one that does. test_matrix.npy is the sample input read by both the Python and Java sides.
https://github.com/tintinxue1/dl4j_KerasImport/tree/master

Version Information

  • Deeplearning4j version ==> 1.0.0-beta6
  • Platform information (OS, etc) ==> macOS
  • CUDA version, if used ==> not applicable
  • NVIDIA driver version, if in use ==> not applicable
Bug DL4J Keras

All 5 comments

Originally started on https://community.konduit.ai/t/imported-keras-lstm-layer-mismatch/124.

Even though this looks like it might be a bug, it would have been better to wait for confirmation instead of cross-posting.

"Only when the model is extremely small or untrained, they do match"

Try giving the larger model a random input and you will find that the outputs match for the larger model too. This is a bug in our Keras import of the Masking layer. With a random input, no time step has all of its input values equal to mask_value, so the Masking layer does nothing in Keras, which is why the outputs are equal.
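To illustrate the rule described here, the following plain-Java sketch computes which time steps a Keras Masking layer would keep: a step is skipped only when every feature at that step equals mask_value. The class and method names are hypothetical, and the mask value of 0.0 is an assumption, since the thread does not state the value used in the poster's model.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of DL4J or Keras.
final class MaskingSketch {

    // sequence is [timesteps][features]. Returns true for each time step that is kept,
    // and false for each step where every feature equals maskValue (the step is masked).
    static boolean[] keepMask(double[][] sequence, double maskValue) {
        boolean[] keep = new boolean[sequence.length];
        for (int t = 0; t < sequence.length; t++) {
            for (double v : sequence[t]) {
                if (v != maskValue) {
                    keep[t] = true; // one differing feature is enough to keep the step
                    break;
                }
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        double maskValue = 0.0; // assumed mask value

        // Zero-padded input: the middle step is all zeros, so it is masked.
        double[][] padded = {{0.5, 1.2}, {0.0, 0.0}, {0.0, 3.1}};
        System.out.println(Arrays.toString(keepMask(padded, maskValue)));

        // Dense input with no all-zero step: nothing is masked.
        double[][] dense = {{0.3, 0.7}, {0.9, 0.1}};
        System.out.println(Arrays.toString(keepMask(dense, maskValue)));
    }
}
```

When the mask is all true, as in the dense case, the Masking layer has no effect, so an import that mishandles masking can still reproduce the Keras output.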

Tagging @AlexDBlack as per Paul's instructions to assign someone (me?) to work on the fix for the bug.
Unit test in repo linked here

In the unit test, note that the model passes for the random input and fails for the other input because of the bug I described earlier.

Using the SequenceRecordReader with the alignment made it work, although I have to write the inputs to a txt file and read them back in. It'll do for now; I hope you can fix the bug soon. Thanks.
