Deeplearning4j: Missing documentation on rl4j

Created on 2 Apr 2017 · 6 Comments · Source: eclipse/deeplearning4j

Issue Description

I could find only very limited documentation on the details of rl4j.
I need to use deep Q-learning with experience replay on custom simulations.
Following the existing rl4j examples, I implemented the MDP interface for my simulation.
I created a QLearning.QLConfiguration as well as a DQNFactoryStdDense.Configuration.
Then I instantiated a DataManager and tried to start learning, but it throws exceptions, which probably means that I am missing some step.
It would be great to have documentation on the specific steps needed to use rl4j with a custom MDP.
Also, implementing the MDP interface requires implementing further interfaces such as ObservationSpace and ActionSpace.
However, there is no description of what the methods in these interfaces are supposed to do.
For example, ObservationSpace has the methods getHigh() and getLow(), but nothing describes what 'high' and 'low' are. Hence, there is no way to know whether you have implemented them correctly.

It would be very helpful to have some documentation on these points.
Thank you!
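
A hedged reading of getLow() and getHigh(), for readers with the same question: rl4j's spaces appear to follow the OpenAI Gym "Box" convention, in which low and high are the per-component lower and upper bounds of an observation. The sketch below illustrates that reading only. BoxSpace, CartSpace, contains and normalize are hypothetical names invented for this example and are not rl4j types (rl4j's own ObservationSpace returns INDArray values, as far as I can tell), and the bound values are made up.

```java
import java.util.Arrays;

final class BoundedSpaceSketch {

    /** Hypothetical stand-in for an observation-space interface; NOT rl4j's own type. */
    interface BoxSpace {
        int[] getShape();    // shape of a single observation, e.g. {2} for two numbers
        double[] getLow();   // assumed meaning: smallest value of each component
        double[] getHigh();  // assumed meaning: largest value of each component
    }

    /** Made-up example: cart position in [-2.4, 2.4], pole angle in [-0.21, 0.21] rad. */
    static final class CartSpace implements BoxSpace {
        @Override public int[] getShape() { return new int[] {2}; }
        @Override public double[] getLow() { return new double[] {-2.4, -0.21}; }
        @Override public double[] getHigh() { return new double[] {2.4, 0.21}; }
    }

    /** True if obs has the right length and every component lies within [low, high]. */
    static boolean contains(BoxSpace space, double[] obs) {
        double[] low = space.getLow();
        double[] high = space.getHigh();
        if (obs.length != low.length) {
            return false;
        }
        for (int i = 0; i < obs.length; i++) {
            if (obs[i] < low[i] || obs[i] > high[i]) {
                return false;
            }
        }
        return true;
    }

    /** Rescales each component from [low, high] to [0, 1]; assumes obs has the space's length. */
    static double[] normalize(BoxSpace space, double[] obs) {
        double[] low = space.getLow();
        double[] high = space.getHigh();
        double[] out = new double[obs.length];
        for (int i = 0; i < obs.length; i++) {
            out[i] = (obs[i] - low[i]) / (high[i] - low[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        BoxSpace space = new CartSpace();
        System.out.println(contains(space, new double[] {0.0, 0.1}));  // inside the bounds
        System.out.println(contains(space, new double[] {3.0, 0.0}));  // position out of range
        System.out.println(Arrays.toString(normalize(space, new double[] {0.0, 0.0})));
    }
}
```

Under this reading, the bounds mainly describe the valid range of the simulation's outputs, for example for validation or input scaling. Whether rl4j's training loop actually consults them is something to verify against the rl4j source.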

Documentation · help wanted

taraliza

👍4

All 6 comments

Have you seen this https://deeplearning4j.org/reinforcementlearning ?

Or the examples
https://github.com/deeplearning4j/dl4j-examples/tree/master/rl4j-examples/src/main/java/org/deeplearning4j/examples/rl4j

Please take a look at those and let me know what details we are lacking.

tomthetrainer on 4 Apr 2017

@tomthetrainer The main gaps are basics like: How do you create custom environments? What's an MDP?
How do you configure an RL algorithm? How does rl4j integrate with dl4j?

agibsonccc on 11 Apr 2017

👍2
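
On "What's an MDP?" and custom environments: a Markov decision process is, in practice, an environment that can be reset to a start state and stepped with an action, returning the next observation, a reward, and a flag saying whether the episode has ended. The sketch below shows that contract on a toy five-cell corridor. SimpleMdp, StepResult, Corridor and runEpisode are hypothetical names for illustration and are not rl4j's MDP interface. From memory, rl4j's interface has a similar shape (reset, step, isDone, plus accessors for the observation and action spaces), but check the source before relying on that.

```java
import java.util.function.IntUnaryOperator;

final class ToyMdpSketch {

    /** Outcome of one transition: next observation, reward, and whether the episode ended. */
    static final class StepResult {
        final int observation;
        final double reward;
        final boolean done;

        StepResult(int observation, double reward, boolean done) {
            this.observation = observation;
            this.reward = reward;
            this.done = done;
        }
    }

    /** Hypothetical minimal environment contract; NOT rl4j's MDP interface. */
    interface SimpleMdp {
        int reset();                   // start a new episode, return the first observation
        StepResult step(int action);   // apply an action, return what happened
        boolean isDone();              // has the current episode ended?
        int actionCount();             // size of the discrete action set
    }

    /** Corridor with cells 0..4; action 0 moves left, action 1 moves right; cell 4 is the goal. */
    static final class Corridor implements SimpleMdp {
        static final int GOAL = 4;
        private int position = 0;

        @Override public int reset() {
            position = 0;
            return position;
        }

        @Override public StepResult step(int action) {
            if (action != 0 && action != 1) {
                throw new IllegalArgumentException("action must be 0 or 1");
            }
            int delta = (action == 1) ? 1 : -1;
            // Walls at both ends: clamp the position into [0, GOAL].
            position = Math.max(0, Math.min(GOAL, position + delta));
            boolean done = isDone();
            return new StepResult(position, done ? 1.0 : 0.0, done);
        }

        @Override public boolean isDone() { return position == GOAL; }

        @Override public int actionCount() { return 2; }
    }

    /** Runs one episode with the given policy (observation -> action) and returns the total reward. */
    static double runEpisode(SimpleMdp mdp, IntUnaryOperator policy, int maxSteps) {
        int observation = mdp.reset();
        double total = 0.0;
        for (int t = 0; t < maxSteps && !mdp.isDone(); t++) {
            StepResult result = mdp.step(policy.applyAsInt(observation));
            observation = result.observation;
            total += result.reward;
        }
        return total;
    }

    public static void main(String[] args) {
        // A fixed "always move right" policy reaches the goal and collects the single reward.
        System.out.println(runEpisode(new Corridor(), obs -> 1, 100));
    }
}
```

A learning algorithm such as DQN would take the place of the fixed policy passed to runEpisode, choosing actions from the observation and improving from the rewards it sees.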

@tomthetrainer It would be great if you could add details on what the different methods in the interfaces are meant to do, so that they can be implemented correctly for a custom MDP. (For example, methods getHigh() and getLow() are in ObservationSpace, but it isn't clear what low and high are). Also, none of the examples currently show how to load a pre-trained DQN and use it as a starting point for further learning.

taraliza on 11 Apr 2017

👍2

Is there documentation for this now?

raimannma on 19 May 2019

👍4

Still none…
Seems I have to migrate to PyTorch to continue my project. :(

Storm-cev on 26 Dec 2019

I am trying to create a custom environment using rl4j. But due to the lack of documentation, I was not able to create an MDP for my environment. In the example programs, I also see that the actions for each observation space/state are set randomly.
It would be helpful if someone could explain how to set custom actions and observations for custom environments.
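
A possible explanation for the random actions, offered as an assumption rather than a confirmed description of the examples: Q-learning agents commonly use epsilon-greedy exploration, choosing a random action with probability epsilon and otherwise the action with the highest estimated Q-value, with epsilon annealed from 1.0 towards a small minimum. If I recall correctly, QLearning.QLConfiguration has minEpsilon and epsilonNbStep parameters for this, but treat that as unverified. The sketch below is a generic illustration with hypothetical names (epsilon, argMax, chooseAction), not rl4j code.

```java
import java.util.Random;

final class EpsilonGreedySketch {

    /** Linearly anneals epsilon from 1.0 down to minEpsilon over annealSteps steps. */
    static double epsilon(int step, double minEpsilon, int annealSteps) {
        if (annealSteps <= 0) {
            return minEpsilon;
        }
        double fraction = Math.min(1.0, (double) step / annealSteps);
        return 1.0 - fraction * (1.0 - minEpsilon);
    }

    /** Index of the largest Q-value (the first one on ties). */
    static int argMax(double[] qValues) {
        int best = 0;
        for (int i = 1; i < qValues.length; i++) {
            if (qValues[i] > qValues[best]) {
                best = i;
            }
        }
        return best;
    }

    /** With probability eps pick a uniformly random action, otherwise the greedy one. */
    static int chooseAction(double[] qValues, double eps, Random rng) {
        if (rng.nextDouble() < eps) {
            return rng.nextInt(qValues.length);
        }
        return argMax(qValues);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        double[] qValues = {0.1, 0.9, 0.3};  // made-up Q-value estimates for three actions
        for (int step = 0; step <= 1000; step += 250) {
            double eps = epsilon(step, 0.1, 1000);
            System.out.println("step " + step + ", epsilon " + eps
                    + ", action " + chooseAction(qValues, eps, rng));
        }
    }
}
```

Under this scheme, early training looks almost entirely random by design. The set of available actions is defined by the environment's action space, and the agent only chooses among them.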