Describe the bug
I found when updating the target network, current sac implementation only updates mlp on top of cnn_extractor, while the cnn_extractor parameters are never updated (code here).
source_params = get_vars("model/values_fn/vf")
target_params = get_vars("target/values_fn/vf")
Notice that cnn_extractor are defined under variable scope "model/values_fn" (code here), and "model/values_fn/vf" only covers the mlp on top of CNN encoding (code here).
I wonder why the target parameters update is like this.
Does it mean current implement assumes we already have shared the encoder between policy and target? But the default cnn_extractor=nature_cnn doesn't have such sharing.
Thanks a lot!
Hey, sorry for such a delay before reply!
You are correct, some of the parameters were not included in the "update target function" process (namely the convolutional layers). I have fixed this and will push a PR fixing this soon.
Most helpful comment
Hey, sorry for such a delay before reply!
You are correct, some of the parameters were not included in the "update target function" process (namely the convolutional layers). I have fixed this and will push a PR fixing this soon.