Hi, On 6/24/20 12:08, Steve Witham wrote:
Before going deep into this... What do you mean by "looks like it's getting stronger,"
I mean that the NN seems to be getting stronger as per the self-play training regime.
and what do you mean by "helps [them] win?"
I mean that, since only the NNs that win will survive the self-play training regime, the NNs that help themselves win will have an evolutionary advantage over those than don't. This is fascinating because it appears to imply that emergent collaboration beats individualism.
Helps *which of the two* identical players to win? Win against whom?
The "them" referred to the NNs that help themselves win in self-play. In self-play, you'd have the same network playing against itself. So, I meant that a network that helps the winning side win could get promoted preferentially over networks that obstinately refuse to lose.
How could they be changing things that outsiders couldn't see? What could they learn to do except continue to seem to be playing the game?
See the examples posted by others to this thread. It does happen.
Thompson's hack works because everyone trusts the compiler to referee its own development process. Then Thompson hacks the referee. After that, the hack part never learns. So you would need a situation where people have put an NN in charge of running its own training process. Even then it's harder than Thompson's hack, but I'm going to stop being evil now.
Hmmm, I think you could still have people in charge of the training process, as the posted examples show. The issue here is that the refereeing is done by performance, rather than by auditing whether the method is sound. And determining whether the method is actually sound is hard, because I hear that in general terms it's from very difficult to impossible to audit what an NN has actually learned.
There is a C compiler whose compiled code has been automatically proven to match the semantics of its source code. I don't know whether the prover was compiled with that very same C compiler.
Are you referring to the verified optimizing C compiler Airbus produced? Andres.