Yes, when calling the predict method of an algorithm, it should only be used with recurrent policies. It lets the LSTMs reset their internal state when the environment resets.
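To illustrate the idea, here is a minimal, library-free sketch. It assumes the "it" above is a per-step reset flag passed to predict (Stable Baselines names this argument `mask`, if I recall correctly). `ToyRecurrentPolicy` and its update rule are made up for illustration and are not part of any real library.

```python
class ToyRecurrentPolicy:
    """Hypothetical stand-in for a recurrent (LSTM-like) policy."""

    def initial_state(self):
        return 0.0

    def predict(self, obs, state=None, mask=False):
        # mask=True means the environment was just reset, so the memory
        # carried over from the previous episode must be cleared.
        if state is None or mask:
            state = self.initial_state()
        state = 0.5 * state + obs  # toy recurrent update
        action = 1 if state > 1.0 else 0
        return action, state


policy = ToyRecurrentPolicy()
state = None
states_seen = []

# Two episodes of made-up observations.
episodes = [[1.0, 1.0, 1.0], [0.2, 0.2]]
for episode in episodes:
    done = True  # the environment was reset before this step
    for obs in episode:
        action, state = policy.predict(obs, state=state, mask=done)
        states_seen.append(state)
        done = False
```

Without the flag, the first step of the second episode would still carry memory from the first episode. With it, the state starts from scratch. A non-recurrent policy has no such state, so the flag would have nothing to reset.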