Imitation learning uses the backpropagation algorithm. Q-learning uses a reward-based method. In ML-Agents we can use imitation learning. Where is the connection in ML-Agents between the backpropagation algorithm and Q-learning? Does imitation learning backpropagate the Q-function of Q-learning? Or am I misunderstanding something? The documentation says nothing about the algorithm behind imitation learning.
Imitation learning is an alternative to reward-based learning (such as Q-learning). Use one or the other at any given time, depending on the task you have.
ML-Agents uses PPO, not Q-learning. Both imitation learning and PPO train a neural net.
ML-Agents is not limited to PPO as a reward-based approach; PPO is just the toolkit's default. You could use whatever reward-based method you can implement or, better, whatever deep learning framework you prefer (TensorFlow is just baked in by default, and should be removed/decoupled in my opinion).
The foundation of the ML-Agents Python toolkit is its IPC between the Unity ML-Agents environment and external applications (glue code for training).
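To make the comparison concrete, the two objectives can be written side by side. This is the standard textbook form, not something taken from the ML-Agents source; the symbols ($\pi_\theta$ for the policy network, $\mathcal{D}$ for the recorded demonstrations, $\hat{A}_t$ for the advantage estimate, $\epsilon$ for the clip range) are introduced here for illustration.

```latex
% Imitation learning (behavioral cloning): supervised loss on recorded pairs
L_{\mathrm{BC}}(\theta) = -\,\mathbb{E}_{(s,a)\sim\mathcal{D}}\big[\log \pi_\theta(a \mid s)\big]

% PPO: clipped surrogate objective built from rewards via the advantage
L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\Big[\min\big(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\Big],
\qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

Only the PPO objective involves rewards (through $\hat{A}_t$). In both cases the network weights are typically adjusted by gradient steps on the objective, with the gradients computed by backpropagation, so backpropagation is the shared training mechanism rather than a link to Q-learning.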
You mean that imitation learning directly shows the agent what reward it will receive when certain data is entered?
Almost. It doesn't need the reward, because the neural net learns to map states to actions from the state-action recordings you provide.
Imitation learning is basically supervised learning on observations -> actions
There we go.
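As a rough illustration of "supervised learning on observations -> actions", here is a minimal behavioral cloning sketch in plain Python. It is not the ML-Agents trainer: the function name `train_behavioral_cloning`, the tiny one-hidden-layer network, and the toy left/right demonstrations are all assumptions made up for this example. The weights are updated by gradient descent on a cross-entropy loss, with the gradients computed by hand-written backpropagation, and no reward appears anywhere.

```python
import math
import random


def train_behavioral_cloning(demos, n_obs, n_hidden, n_actions,
                             epochs=500, lr=0.1, seed=0):
    """Fit a tiny MLP policy to (observation, action) pairs.

    Hypothetical helper for illustration only; not part of ML-Agents.
    """
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_obs)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_actions)]
    b2 = [0.0] * n_actions

    def forward(obs):
        # Hidden layer with tanh activation, then softmax over actions.
        h = [math.tanh(sum(w * x for w, x in zip(row, obs)) + b)
             for row, b in zip(W1, b1)]
        logits = [sum(w * v for w, v in zip(row, h)) + b
                  for row, b in zip(W2, b2)]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return h, [e / total for e in exps]

    for _ in range(epochs):
        for obs, action in demos:
            h, probs = forward(obs)
            # Gradient of cross-entropy w.r.t. logits: softmax minus one-hot.
            d_logits = [p - (1.0 if k == action else 0.0)
                        for k, p in enumerate(probs)]
            # Backpropagate through W2 and the tanh (derivative is 1 - h^2).
            d_pre = [sum(d_logits[k] * W2[k][j] for k in range(n_actions))
                     * (1.0 - h[j] ** 2) for j in range(n_hidden)]
            # Gradient descent step on the output layer.
            for k in range(n_actions):
                for j in range(n_hidden):
                    W2[k][j] -= lr * d_logits[k] * h[j]
                b2[k] -= lr * d_logits[k]
            # Gradient descent step on the hidden layer.
            for j in range(n_hidden):
                for i in range(n_obs):
                    W1[j][i] -= lr * d_pre[j] * obs[i]
                b1[j] -= lr * d_pre[j]

    def policy(obs):
        # Act greedily: pick the most probable action for this observation.
        _, probs = forward(obs)
        return max(range(n_actions), key=lambda k: probs[k])

    return policy


# Toy demonstrations (made up): observation is the signed offset to a target,
# action 0 means "move left", action 1 means "move right".
demos = [([-1.0], 0), ([-0.5], 0), ([0.5], 1), ([1.0], 1)]
policy = train_behavioral_cloning(demos, n_obs=1, n_hidden=4, n_actions=2)
print(policy([-0.8]), policy([0.8]))
```

The recorded actions play the role of labels, which is why no reward signal is needed. A reward-based trainer such as PPO would keep the same kind of network and the same gradient machinery but swap the loss for one built from rewards.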
What kind of algorithm is used to set the neural network weights in imitation learning? Backpropagation or something else? I mean in ML-Agents. Thank you :)
@ervteng
Hi all -- this issue has been inactive for some time so I'm going to close it. Feel free to reopen or create a new issue if you have more to discuss.