Deeplearning4j: RL4J does an image preprocessing step in the HistoryProcessor

Created on 27 Feb 2020  路  2Comments  路  Source: eclipse/deeplearning4j

It is causing trouble when I try to write a simple player sample that shows how to play an ALE game with a trained network.
The offending image resizing code is here:
https://github.com/eclipse/deeplearning4j/blob/master/rl4j/rl4j-core/src/main/java/org/deeplearning4j/rl4j/learning/HistoryProcessor.java#L111

It is called from only one location here. The observation gets resized before adding to the history.
https://github.com/eclipse/deeplearning4j/blob/master/rl4j/rl4j-core/src/main/java/org/deeplearning4j/rl4j/learning/HistoryProcessor.java#L59

And add only gets called from the LegacyMDPWrapper when it has a historyProcessor:
https://github.com/eclipse/deeplearning4j/blob/master/rl4j/rl4j-core/src/main/java/org/deeplearning4j/rl4j/util/LegacyMDPWrapper.java#L59
https://github.com/eclipse/deeplearning4j/blob/master/rl4j/rl4j-core/src/main/java/org/deeplearning4j/rl4j/util/LegacyMDPWrapper.java#L80

Refactored sample repo is here. Trainers work. Player does not.
https://github.com/RobAltena/ArcadeLearningEnvironment

How should we refactor this? I think RL4J should not have any preprocessing code. And samples should use our Datavec image resizing.

Pinging @AlexDBlack @saudet and @aboulang2002 for opinions as you have been working on refactoring RL4J. Thank you!

Bug RL4J

All 2 comments

Please check out pull #8711 and add your comments about that there!

I agree. I work to remove completely the HistoryProcessor. After pull #8711 (and the part 2, not yet in a PR), only .record() of HistoryProcessor will be used.
In another PR, I will remove HistoryProcessor and add a .record() to TransformProcess. It may seem weird that the TransformProcess does the recording, but I think some users will find it useful to be able to record at any step during the transform (for debugging / fine-tuning purposes).
Also, LegacyMDPWrapper is a wrapper I introduced to help me make the changes but stay compatible with the old code. It should be viewed as a temporary class that will be removed when the RL4J refac will be finished and most users will have migrated their code away from the old RL4J.

Was this page helpful?
0 / 5 - 0 ratings