Stable-baselines: What would be a good library that implements tabular RL algorithms?

Created on 14 Nov 2020 · 3 Comments · Source: hill-a/stable-baselines

Apparently, all the well-known RL libraries focus on deep RL, and they (including stable baselines) don't seem to provide implementations of common tabular RL algorithms, such as tabular Q-learning, that can deal with different types of states (e.g. images, x-y coordinates, configurations of the board, etc.) and actions.

To confirm: does stable baselines support any of these tabular RL algorithms? If not, which RL library implements them?

Of course, you could say: "Just implement these algorithms yourself", given that they are very simple and you don't have to deal with the hassles of neural networks and so on. Still, I would like to avoid implementing them (at least for now), especially if there is already a good, reliable library that supports many use cases. Yes, there are GitHub projects, such as this one, but they are not really libraries, just simple implementations, which are probably not even efficient. Moreover, I know that this is not strictly related to stable baselines, but I think that some people who are interested in applying RL may also have this question when they come across stable-baselines (although everyone seems to love just deep RL).

question

nbro

All 3 comments

Hello,
You should probably take a look at MushroomRL and GenRL, although I have never tested them myself.
But as you mention, if you really need only tabular algorithms, the best option is to re-implement them yourself.

araffin on 14 Nov 2020
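
For readers weighing the "re-implement them yourself" suggestion, here is a minimal sketch of tabular Q-learning in plain Python. It is not taken from stable-baselines, MushroomRL, or GenRL; `ChainEnv`, `greedy_action`, and `q_learning` are hypothetical names invented for this illustration, and the toy environment uses its own simple `reset`/`step` interface rather than any library's API. The Q-table is a dictionary keyed by `(state, action)`, so any hashable state (an x-y tuple, a flattened board configuration, and so on) should work without changes to the learning loop.

```python
import random
from collections import defaultdict


class ChainEnv:
    """Hypothetical toy environment: walk along a 1-D chain to reach the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.actions = (0, 1)  # 0 = move left, 1 = move right
        self.pos = 0

    def reset(self):
        self.pos = 0
        return (self.pos,)  # states are hashable tuples

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position so the agent stays inside the chain.
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        return (self.pos,), reward, done


def greedy_action(q, state, actions, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[(state, a)] for a in actions)
    return rng.choice([a for a in actions if q[(state, a)] == best])


def q_learning(env, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, max_steps=100, seed=0):
    """Epsilon-greedy tabular Q-learning; returns the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) -> estimated return
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(env.actions)  # explore
            else:
                action = greedy_action(q, state, env.actions, rng)  # exploit
            next_state, reward, done = env.step(action)
            # Terminal states have no future value to bootstrap from.
            best_next = 0.0 if done else max(
                q[(next_state, a)] for a in env.actions)
            td_target = reward + gamma * best_next
            q[(state, action)] += alpha * (td_target - q[(state, action)])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    env = ChainEnv(length=5)
    q = q_learning(env)
    rng = random.Random(0)
    for pos in range(env.length - 1):
        print(pos, greedy_action(q, (pos,), env.actions, rng))
```

The dictionary-based table trades speed for generality: for small, integer-indexed state spaces a NumPy array would likely be faster, while the dictionary avoids having to enumerate states up front.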

Perhaps these could be a fine addition to stable-baselines3-contrib? Off the top of my head, I can't think of any major obstacles to including them.

Miffyli on 15 Nov 2020

> Perhaps these could be a fine addition to stable-baselines3-contrib?

I'm not sure it is a good fit, as the SB3 API won't work ("MlpPolicy", ...) and the potential applications seem quite limited.
I would rather favor an FQI (Fitted Q Iteration) implementation.

araffin on 15 Nov 2020
