ML-Agents: Imitation learning algorithm and Q-learning

Created on 13 May 2019 · 10 Comments · Source: Unity-Technologies/ml-agents

Imitation learning uses the backpropagation algorithm, while Q-learning uses a reward-based method. ML-Agents supports imitation learning. Where is the connection in ML-Agents between the backpropagation algorithm and Q-learning? Does imitation learning backpropagate the Q-function of Q-learning? Or am I misunderstanding something? The documentation says nothing about the algorithm used for imitation learning.

discussion help-wanted

FreeRP

All 10 comments

Imitation learning is an alternative to reward-based learning (Q-learning). Use one or the other at any given time, depending on the task.

tjad on 14 May 2019

👍1
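For readers unfamiliar with the reward-based side of this comparison, here is a minimal sketch of a single tabular Q-learning update. It is a generic textbook illustration, not code from ml-agents; the function name `q_learning_update`, the table size, and the values of `alpha` and `gamma` are assumptions chosen for the example.

```python
import numpy as np

def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning step (hypothetical helper, not an ml-agents API)."""
    # Target: observed reward plus the discounted best value of the next state.
    td_target = reward + gamma * np.max(q_table[next_state])
    # Move the current estimate a fraction alpha toward that target.
    q_table[state, action] += alpha * (td_target - q_table[state, action])
    return q_table

# Toy table: 3 states, 2 actions, all values start at zero.
q = np.zeros((3, 2))
q = q_learning_update(q, state=0, action=1, reward=1.0, next_state=1)
```

The point to notice is that the update is driven entirely by the reward signal; no recorded demonstrations are involved.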

ML-Agents uses PPO, not Q-learning. Both imitation learning and PPO train a neural net.

roboserg on 14 May 2019

👍1

ML-Agents is not limited to PPO as a reward-based approach; PPO is just the toolkit's default. You could use whatever reward-based method you can implement or, better, whatever deep learning framework you prefer (TensorFlow is just baked in by default, and should be removed/decoupled in my opinion).

The foundation of the ML-Agents Python toolkit is its IPC between the Unity ML-Agents environment and external applications (glue code for training).

tjad on 14 May 2019

Imitation learning is an alternative to reward-based learning (Q-learning). Use one or the other at any given time, depending on the task.

You mean that imitation learning directly shows the agent what reward it will receive when certain data is entered?

FreeRP on 14 May 2019

Almost. It doesn't need the reward, because the neural net maps states to actions, based on the state-action recordings you provide.

tjad on 14 May 2019

👍1

Imitation learning is basically supervised learning on observations -> actions

roboserg on 14 May 2019

👍1
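To make "supervised learning on observations -> actions" concrete, here is a minimal behavioral-cloning sketch in NumPy. It is not taken from the ml-agents code and makes no claim about how the toolkit implements imitation learning: the demonstration data, the "expert" rule, the network size, and the learning rate are all made up for illustration. It trains a tiny network to reproduce recorded actions using cross-entropy loss and hand-written backpropagation, with no reward anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical demonstrations: 2-D observations; the "expert" picks
# action 1 when the two observation values sum to more than zero.
obs = rng.normal(size=(256, 2))
actions = (obs.sum(axis=1) > 0).astype(int)

# Small policy network: 2 inputs -> 16 tanh units -> 2 action logits.
W1 = rng.normal(scale=0.5, size=(2, 16))
b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 2))
b2 = np.zeros(2)

def forward(x):
    """Return hidden activations and softmax action probabilities."""
    h = np.tanh(x @ W1 + b1)
    logits = h @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return h, probs

lr = 0.5
n = len(actions)
for _ in range(300):
    h, probs = forward(obs)
    # Gradient of the mean cross-entropy loss with respect to the logits.
    d_logits = probs.copy()
    d_logits[np.arange(n), actions] -= 1.0
    d_logits /= n
    # Backpropagate through the output layer, then the tanh hidden layer.
    dW2 = h.T @ d_logits
    db2 = d_logits.sum(axis=0)
    d_h = (d_logits @ W2.T) * (1.0 - h ** 2)
    dW1 = obs.T @ d_h
    db1 = d_h.sum(axis=0)
    # Plain gradient-descent step on every parameter.
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

# Fraction of demonstration steps where the policy picks the expert's action.
accuracy = (forward(obs)[1].argmax(axis=1) == actions).mean()
```

The only training signal here is the recorded action for each observation, which is why no reward function is needed in this setup.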

There we go.

tjad on 14 May 2019

Imitation learning is basically supervised learning on observations

What kind of algorithm is used to set the neural network weights in imitation learning? Backpropagation or something else? I mean in ML-Agents. Thank you :)

FreeRP on 14 May 2019

@ervteng