Hi, taking Pong-v0 as an example, there are plenty of examples of training an RL agent to play against the game bot, and I also found that play.py is a script that lets a human player play against the game bot. However, I am wondering: is it possible to replace the game bot with the trained RL agent and have it play against a human player? After I have obtained the optimal hyperparameters and weights for my neural network, what should I do next?
In short, I have a trained RL bot, and I would like to play against it.
https://github.com/koulanurag/ma-gym
This is a Multi-Agent API for Gym. Have a look at the wiki,
https://github.com/koulanurag/ma-gym/wiki/Usage#customizing-an-environment. Give it a try.
Hi, thank you, this seems really useful, but after reading through the scripts and documentation I have some questions.
The atari gym environments may not have the multiplayer setup. The gym retro environments should have it though: https://github.com/openai/retro When you instantiate the RetroEnv instance you can specify the number of players: https://retro.readthedocs.io/en/latest/python.html#retro.RetroEnv
import retro
env = retro.make('Pong-Atari2600', state='Start.2P', players=2)
obs = env.reset()
The action space can be a little confusing, you'll have to figure out how it maps which keys to which players, but it should be doable.
In general though, this is a sort of per-environment capability. Most environments are not 2 player, and 2 player environments may or may not support a single player mode or playing against humans.
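To illustrate the action-space layout mentioned above, here is a minimal sketch in plain Python. It assumes that a 2-player retro environment exposes one flat binary vector with player 1's buttons first and player 2's after them, and that the Atari 2600 core has the 8 buttons listed in `BUTTONS`. Both are assumptions to verify against your own `env.action_space`; the helper names are hypothetical, and which buttons actually move the paddle has to be found by experiment.

```python
# Assumed per-player button order; confirm it against your own environment.
BUTTONS = ["BUTTON", None, "SELECT", "RESET", "UP", "DOWN", "LEFT", "RIGHT"]


def player_bits(pressed):
    """Return one player's button vector: 1 for each named button in `pressed`."""
    return [1 if name is not None and name in pressed else 0 for name in BUTTONS]


def joint_action(p1_pressed, p2_pressed):
    """Concatenate both vectors, assuming player 1 comes first."""
    return player_bits(p1_pressed) + player_bits(p2_pressed)


# Example: player 1 holds UP while player 2 holds DOWN.
action = joint_action({"UP"}, {"DOWN"})
```

The resulting list has 16 entries and could be passed to `env.step` if the layout assumption holds for your environment.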
Thanks. I have figured out which bit of the MultiBinary action space maps to which player in gym retro, but now my problem is how to get the keyboard input. In the atari gym environment there is a function get_keys_to_action (as far as I understand from the play.py script), but there is no such function or API in gym retro. Do you have any suggestions on how I can get the player's input from the keyboard?
Hey, @nfkok, I am working on something very similar to what you are doing, on the PongNoFrameskip-v4 environment (trying to play as the brown paddle against my DQN-trained bot on the green paddle), and stumbled upon this issue. Did you have any luck with this approach? Any advice would be much appreciated! :)
@epiicme I used pygame to map the action space to the keyboard in retro Pong-Atari2600. I migrated everything from gym to retro.
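A rough sketch of how such a pygame mapping could look (this is not the poster's actual code). The key names, the `KEYMAP` table, and both function names are hypothetical. `read_held_keys` uses `pygame.key.get_pressed()`, which needs an initialized pygame window to receive keyboard state; pygame is imported lazily so the pure mapping function can be used without it.

```python
# Hypothetical mapping from key names to the button names of the human player.
KEYMAP = {"up": "UP", "down": "DOWN"}


def buttons_from_keys(held_keys, keymap=KEYMAP):
    """Translate the set of currently held keys into a set of button names."""
    return {keymap[key] for key in held_keys if key in keymap}


def read_held_keys():
    """Poll pygame for the arrow keys that are held down right now."""
    import pygame  # lazy import: only needed when actually reading the keyboard

    pygame.event.pump()  # let pygame process pending window events
    state = pygame.key.get_pressed()
    held = set()
    if state[pygame.K_UP]:
        held.add("up")
    if state[pygame.K_DOWN]:
        held.add("down")
    return held
```

In a game loop, `buttons_from_keys(read_held_keys())` would be evaluated once per frame and the result written into the human player's part of the action vector.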
But now I have another problem: the agent that I trained in retro is not improving even after 3000 episodes. So even though I have successfully integrated the pygame control into the game, I do not have a trained agent to play against. In gym Pong-v0, the agent played very well at inference time (with no more backprop) after a day of training, but the same approach does not work in retro Pong-Atari2600.
May I ask whether you wrote your own DQN script or used TensorFlow? Can you try to train your agent in retro's Pong-Atari2600 and tell me whether it is learning or not?
I hope we can discuss more on this, thanks.
@nfkok, thanks for the info. The DQN script I wrote is based on a book called "Deep Reinforcement Learning Hands-On" by Maxim Lapan, and it's completely in PyTorch.
That's a good idea, assuming the conversion from gym to retro isn't too much work. I'll try to see if it can train just as well on Pong-Atari2600. Could you let me know how difficult it was to change libraries?
@epiicme migrating to retro is almost the same. The main difference is the action space, which is a MultiBinary vector indexed by button rather than a Discrete action as in gym, and retro only supports Python 3. You can prepare two environments: one single-player environment for training, and a second 2-player environment for inference with Pygame keyboard input built in. Quite straightforward.
During training, what I did in retro Pong was to implement the same thing as Andrej Karpathy did (same pre-processing, policy forward, RMSProp, etc.), but it just does not work. Not sure what's gone wrong.
@nfkok, ok thanks for the update. I'll look into setting up a training environment, and if that improves I'll let you know.
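One part of that recipe that depends on the environment's reward signal is the discounting step, which resets the running sum whenever a point is scored. Below is a minimal plain-Python sketch of that step (Karpathy's script uses NumPy, and `gamma=0.99` is its default). It is offered as something to compare against when porting, not as a diagnosis of the problem described above.

```python
def discount_rewards(rewards, gamma=0.99):
    """Compute discounted returns, resetting at every nonzero reward.

    In Pong a nonzero reward marks the end of a point, so the running sum
    is reset there and returns never leak across point boundaries.
    """
    discounted = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0  # a point was scored: start a new sum
        running = running * gamma + rewards[t]
        discounted[t] = running
    return discounted
```

If the ported environment reports rewards with a different sign, scale, or timing, this function would silently produce different returns, so it is worth logging its input on a few episodes.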
Strange that your network doesn't work though. It sounds like it should be working well.
Hello, @nfkok, I'm working on something very similar and am currently trying to map keyboard inputs to the action space to control one of the paddles in the game. I was wondering how you went about it, and also whether you were successful in playing against the trained AI?
Hi. In the end I did not use the gym or gym-retro environments. Instead, I wrote my own Pong game using Pygame and trained on it using the policy gradient framework by Andrej Karpathy (replacing the gym environment with my own Pong game, with the same frame size of 80x80 pixels). Each episode consists of 11 games, and the computer player follows the y-coordinate of the ball with 75% probability. The training is not ideal, but the RL agent still manages to reach a running mean of -5.5 (winning 5.5 games per episode) after a week of training at a learning rate of 1.5e-3.
I tried replacing the computer player with the Pygame keyboard input and it is playable. I win most of the time, but the agent is not too bad either.
For now I am still training with different parameters and game conditions to try to improve it.
Although gym retro supports multiple agents, something seems not quite right. My agent still lost every game and tended to stay still at the bottom after over 20,000 episodes of training (across several attempts, which cost me a few weeks :P). I checked the frames pixel by pixel and the reward mechanism but still could not find where the problem is. So in the end I decided to write my own Pong game.
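The scripted opponent described above could be sketched roughly as follows. The function name, the `step` size, and the choice to stay still in the remaining 25% of frames are assumptions; the post only states that the computer player follows the ball's y-coordinate with 75% probability.

```python
import random


def opponent_move(paddle_y, ball_y, rng, follow_prob=0.75, step=1):
    """Return the opponent's new y position for this frame.

    With probability `follow_prob` the paddle moves one step toward the
    ball; otherwise it stays where it is (an assumption of this sketch).
    """
    if rng.random() >= follow_prob:
        return paddle_y  # skip tracking this frame
    if ball_y > paddle_y:
        return paddle_y + step
    if ball_y < paddle_y:
        return paddle_y - step
    return paddle_y


# Example: one frame of movement with a seeded generator.
rng = random.Random(0)
new_y = opponent_move(paddle_y=40, ball_y=60, rng=rng)
```

Lowering `follow_prob` makes the opponent easier to beat, which may be a useful knob when the agent is not learning at all.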
Thank you for the info, and it seems strange that even after 20,000 episodes the agent isn't playing optimally.
Would it be possible to get some details on the keyboard input? I'm a bit confused as to how to implement it. I've tried using pygame, but when I change the human player's part of the action (the action space is shared by both players) by reading the keyboard input and setting the specific index to 1, my paddle goes down (or up) really fast. There's no incremental change in position; the paddle just shoots off. Maybe I'm updating the action at the wrong time, I'm unsure.
Thank you
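Two common causes of a paddle that "shoots off" are a bit that stays set after the key is released, and a loop that steps the environment far faster than real time. The thread does not confirm which applies here. The sketch below shows only the first point: the action is rebuilt from zeros on every frame, so a bit is 1 only while its key is held. The indices `up_index=4` and `down_index=5` and the 8-button layout are hypothetical.

```python
def action_for_frame(held, n_buttons=8, up_index=4, down_index=5):
    """Build the human player's bits from scratch for a single frame."""
    bits = [0] * n_buttons  # start from all zeros every frame
    if "up" in held:
        bits[up_index] = 1
    if "down" in held:
        bits[down_index] = 1
    return bits


# Simulated key states for three consecutive frames: press up, release, press down.
frames = [{"up"}, set(), {"down"}]
actions = [action_for_frame(held) for held in frames]
```

For the second point, capping the loop with `pygame.time.Clock().tick(60)` once per frame is a usual way to keep the game at a playable speed.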