Re: [math-fun] AI training

23 Jun 2020

...
LCZero then changes strategy itself to take advantage of Stockfish's inferred weakness.
It would be equally plausible to tell the story that lc0 sees Stockfish
as moving into what Stockfish perceives as a stronger position.
After all, that's what it's supposed to do.

I remember a review of a game, DeepMind vs. Stockfish I think, where
a locked position was reached, and then in the course of 10 or so moves
perceived as "shuffling" by the reviewer, the locked position collapsed
into a clear advantage.  I think this is a better explanation: making
an exchange that's 1% in your favor, 10 times over.  The evaluation of
those 1% edges is key.

There is a potential hazard with any training regimen, of training
into a zone that doesn't represent the broader reality.

Dave Dyer
