models/video_prediction @cbfinn
Thank you for generously sharing the code! I have three questions about the released code:
Are the hyperparameters used in the paper the same as the default options in prediction_train.py? In particular, the number of training steps.
For the most part, yes. There are a few differences.
I observed a strange val_loss trend line, so I wonder if I made a mistake.
That curve is about what I would expect. It looks strange because of scheduled sampling, a curriculum which stochastically passes in ground-truth frames at some timesteps during the beginning of training. The curriculum ends around 12k steps (see citation [2] in the paper for details). To turn off scheduled sampling, you can set --schedsamp_k=-1.
Alternatively, you could change the code to set schedsamp_k=-1 for the validation model, regardless of what's used for the training model. This might be nice.
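A rough sketch of that change, assuming the Model constructor is extended to accept a schedsamp_k override (in the released script the value is read from FLAGS, so the exact plumbing below is hypothetical):

```python
# Hypothetical sketch only: Model, FLAGS, and the image/action/state tensors
# come from prediction_train.py; the schedsamp_k constructor argument is an
# assumed extension, not part of the released code.
train_model = Model(train_images, train_actions, train_states,
                    schedsamp_k=FLAGS.schedsamp_k)

# Force -1 for the validation model so val_loss always reflects the model's
# own predictions, regardless of where the training curriculum currently is.
val_model = Model(val_images, val_actions, val_states,
                  schedsamp_k=-1)
```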
can you share some figures on the expected performance of the trained model over the val/train sets?
I did this work when I was an intern at Google Brain, and I no longer have access to data/code/training curves that I used for the paper.
is there a plan to also release the evaluation/visualization script for the model?
I'm not planning on doing this in the immediate future, but I would love to have something like this added to the released code. I'd be happy to help review code for this, and potentially add to it. For example, I think that tiling animated gifs is a great way to visualize the model's predictions, as seen here: https://sites.google.com/site/robotprediction/ (scroll down about halfway). I have the code for tiling predictions together and saving them into a gif, which I'd be happy to share.
It would also be really useful to visualize the gifs during training, e.g., in tensorboard (https://github.com/tensorflow/tensorflow/issues/3936)
Thanks for the response @cbfinn.
@falcondai: it seems you got the answers you were seeking.
Closing this out. If you have more concerns, please file a new issue or check with @cbfinn.
@cbfinn Thanks for the clarifications and pointers! I will follow up with more specific issues should they arise.
@cbfinn
Regarding your previous comment, how can I create tiled animated gifs visualizing the model's predictions? I have tried to analyze and modify the input and training files, but I couldn't get it working. Could I get some help with that?
Here's an example script that loads images from the pushing dataset and exports them to gifs using the moviepy package (though it does not tile them).
grab_train_images.py.zip
It is straightforward to use moviepy to stack gifs side-by-side, to form a tiling.
http://zulko.github.io/moviepy/getting_started/compositing.html#stacking-and-concatenating-clips
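For example, a minimal sketch of tiling two predicted sequences side by side with moviepy (the frame arrays here are random placeholders standing in for real predictions):

```python
import numpy as np
from moviepy.editor import ImageSequenceClip, clips_array

# Placeholder data: two sequences of ten 64x64 RGB frames in [0, 255].
seq_a = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(10)]
seq_b = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(10)]

clip_a = ImageSequenceClip(seq_a, fps=5)
clip_b = ImageSequenceClip(seq_b, fps=5)

# Stack the two clips into one row of a grid and write a single gif.
tiled = clips_array([[clip_a, clip_b]])
tiled.write_gif("tiled_predictions.gif", fps=5)
```

clips_array takes a nested list of rows, so ground truth and predictions could also go in separate rows of the same grid.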
@cbfinn @falcondai
Thanks for your generous reply! I will work through the rest of the code starting from the included scripts :)
@tegg89 I ended up using imageio for creating GIFs. Its API is pretty straightforward. For an example (IPython notebook): https://gist.github.com/falcondai/1e22919e6ce8d6a8e3dd3da5a6a0ad94
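For reference, a minimal version of that approach (the frames below are placeholders; in practice they would come from gen_images):

```python
import numpy as np
import imageio

# Placeholder frames: ten random 64x64 RGB images standing in for predictions.
frames = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(10)]

# Write the whole sequence to an animated gif in one call; frame timing can be
# tuned with the duration keyword, depending on the imageio version.
imageio.mimsave("prediction.gif", frames)
```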
@cbfinn @falcondai
When I put data through the network for evaluation, the resulting GIF file is not sequential.
I have already disabled the shuffle option in prediction_input.py.
@tegg89 Make sure you are only calling session.run() once for the entire sequence, rather than once for each frame. The script grab_train_images.py shows how to extract a sequence of images in order, with a single sess.run() per sequence.
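To illustrate the difference, a small sketch (model.gen_images and model.iter_num are from the released script; the surrounding setup is assumed):

```python
# Problematic pattern: one sess.run() per frame. Each call advances the input
# queue, so successive "frames" actually come from different batches.
# frames = [sess.run(model.gen_images[t], feed_dict={model.iter_num: -1})
#           for t in range(sequence_length)]

# Correct pattern: a single sess.run() returns the whole predicted sequence
# for one batch, so the frames stay in temporal order.
gen_images = sess.run(model.gen_images, feed_dict={model.iter_num: -1})
```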
@cbfinn @falcondai
Sorry to keep bothering you with questions, but I am still having trouble visualizing the test data.
Referring to grab_train_images.py, I changed the input file so that it returns sequential video frames. However, when I fed this through the network model, gen_images did not come out in sequential order. The modified code is here. The steps I ran through are as follows:
1. Load the data with prediction_input.py (I already checked that the images are returned in sequential order).
2. Build the Model class from prediction_train.py.
3. Run gen_images = sess.run([model.gen_images], feed_dict={model.iter_num: -1}) (the learning_rate term is deleted from feed_dict because it is not used).
4. Convert gen_images to a gif.
The result is not in sequential order. It seems to me that the network model itself is making the frames non-sequential. How did you visualize the model's output for evaluation?
@cbfinn Thanks for your paper and code. And sorry to bother you with a small detail.
train_val_split is 0.95, and I saw that this TF version also uses the same setting by default. The validation PSNR (which I use for evaluation) does not always agree with the test PSNR. I choose the best model by selecting the best validation PSNR during training, but sometimes the PSNR of some periodic checkpoint models is higher than that of the selected best model (by a gap of up to 0.5). Is train_val_split == 0.95 not enough in practice?
@carsonDB The percentage is the same, but the actual videos used for training and validation are different (as they are randomized).
@cbfinn Thanks for your quick reply!