Leela-zero: Some other "Zero" projects in Python (Keras)

Created on 23 Nov 2017 · 3 comments · Source: leela-zero/leela-zero

Some related ("Zero"-style) projects in Python:

Mokemokechicken did a wonderful adaptation of this methodology to the game of Reversi: https://github.com/mokemokechicken/reversi-alpha-zero

His results were gorgeous: starting from scratch, he obtained a superhuman model (stronger than classic AI engines).
I've done an adaptation of Mokemokechicken's code to apply it to the game of Connect4: https://github.com/Zeta36/connect4-alpha-zero

With just a CPU, my model was able to play an almost perfect game after a few hours, defeating all the online Connect4 programs I found on the Internet.

I also made an adaptation for chess: https://github.com/Zeta36/chess-alpha-zero, but unfortunately I don't have a GPU to train this more complex game. Anyway, the code is there and it's functional. If somebody has an idle GPU, it'd be great to know whether the chess adaptation can learn to play at least as well as a good amateur (it'd probably take at least a week even with a powerful GPU).

Finally, I'd like to point out how easily the idea behind AlphaGo Zero can be applied to many other situations, almost just by changing the environment model (state, action, reward).
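To make that concrete, here is a minimal sketch (not code from any of the linked repositories; all names are illustrative) of the kind of state/action/reward interface a game has to expose so that the same self-play loop can drive it:

```python
# Hypothetical environment interface: any game exposing reset(),
# legal_actions() and step() can be plugged into the same self-play loop.
class Environment:
    def reset(self): ...
    def legal_actions(self): ...
    def step(self, action): ...  # returns (next_state, reward, done)


# Toy example just to show the interface shape: a trivial "count to 3" game.
class CountToThree(Environment):
    def reset(self):
        self.state = 0
        return self.state

    def legal_actions(self):
        return [1]  # the only move is to increment

    def step(self, action):
        self.state += action
        done = self.state >= 3
        reward = 1.0 if done else 0.0
        return self.state, reward, done


# A generic driver loop that knows nothing about the specific game.
env = CountToThree()
s, done = env.reset(), False
while not done:
    s, r, done = env.step(env.legal_actions()[0])
```

Swapping Reversi for Connect4 or chess then amounts to replacing the environment class (plus the network's input/output shapes), while the search and training code stays the same.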

I hope you like these projects.


All 3 comments

First of all, wonderful projects. Really impressive that the idea can work in such a general way out of the box.
However, to avoid over-hyping the idea of a superhuman-level agent reaching these results so fast without massively strong hardware, I have to note the following:

@mokemokechicken 's model hasn't reached classic-AI level, let alone superhuman level, yet. In fact, even now, @mokemokechicken 's best model struggles against a low setting of a relatively mediocre program.

However, the very fact that seemingly substantial growth has been made without a distributed learning environment is definitely impressive.

Excellent and impressive work :)

I have a version for Tak: https://github.com/GeneralZero/TakZeropy. I have about 4K games, but it takes forever to train on.

You are right, @grolich. In fact, I've added a distributed option to the code so we can make use of multiple machines working at the same time.

Also, I've just added a pre-training process using supervised learning on games (from PGN files), so we can bootstrap the policy before starting the self-play improvement. This is similar to what AlphaGo did in its original version.
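The idea behind such a pre-training phase can be sketched as plain supervised learning: minimize the cross-entropy between the policy's output and the move the expert actually played in each position. The sketch below is not the repository's code; it uses a linear softmax policy and random data standing in for positions/moves parsed from PGN files, and all names are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_moves = 8, 4
W = np.zeros((n_features, n_moves))  # linear policy weights, zero-initialized

# Fake "expert" dataset standing in for (position, played move) pairs
# that would really come from PGN games.
states = rng.normal(size=(64, n_features))
expert_moves = rng.integers(0, n_moves, size=64)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(W):
    # average negative log-likelihood of the expert moves under the policy
    p = softmax(states @ W)
    return -np.log(p[np.arange(len(states)), expert_moves]).mean()

initial_loss = nll(W)  # log(4) for the uniform policy at W = 0
for _ in range(300):
    probs = softmax(states @ W)
    grad = probs.copy()
    grad[np.arange(len(states)), expert_moves] -= 1.0  # dLoss/dlogits
    W -= 0.1 * states.T @ grad / len(states)           # gradient step
final_loss = nll(W)
```

After this phase, the (much larger) real network already prefers plausible expert moves, which gives self-play a far better starting point than a random policy.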

Regards.

