Re: [math-fun] can't win by resigning
Now that Lee Sedol has won game 4 of the 5-game match (having already lost the match with yesterday's third game), it looks like AlphaGo does indeed need some fine-tuning in the end-game. When it's behind at the end it seems to try cheap tricks that might work on kyu-level players, and to make forcing moves that delay the inevitable even while losing a few more points. It eventually resigned - I don't know whether it's wired for that or whether its trainers threw its virtual towel into the ring. When it's ahead it makes what the commentator called "slack" moves - safe moves that take the pressure off the opponent.

When I ran my chess program in several human tournaments in Pittsburgh (I still have its USCF membership cards for a couple of years in the early '70s - that's before they stopped accepting computers), it would thrash around when it saw an inevitable mate coming, sacrificing material just to push the mate out another move. It looked bizarre. I resigned for it several times. Oh, the embarrassment. But the bug didn't seem worth fixing.

AlphaGo's play is interestingly different from humans' in another way: its evaluation ignores the margin of the win in order to maximize the probability of a win. This made the third game, for example, seem closer than it was - AlphaGo apparently had other aces up its conduits if it had run into trouble. Many players will attack anything attackable just to finish off the game. Bobby Fischer wasn't satisfied with winning his chess games - he felt he had to crush his opponents.
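A toy sketch of that evaluation difference (this is not DeepMind's code - the move names and numbers are invented for illustration): an agent maximizing win probability will prefer a near-certain small win over a risky crushing one, while a margin-maximizer picks the opposite.

```python
# Toy sketch (not AlphaGo's actual code): two ways to pick a move.
# Each candidate move carries a hypothetical win probability and an
# expected margin of victory in points - both numbers are invented.
moves = {
    "safe":  {"p_win": 0.95, "margin": 1.5},   # small but near-certain win
    "risky": {"p_win": 0.80, "margin": 20.0},  # big win, but more can go wrong
}

# Maximize probability of winning, as AlphaGo reportedly does:
by_probability = max(moves, key=lambda m: moves[m]["p_win"])

# Maximize expected winning margin, closer to how a "crusher" plays
# (losses contribute zero margin, so expected margin is p_win * margin):
by_margin = max(moves, key=lambda m: moves[m]["p_win"] * moves[m]["margin"])

print(by_probability)  # safe
print(by_margin)       # risky
```

With these invented numbers the probability-maximizer takes the 1.5-point win at 95%, which is exactly why its games can look closer than they are.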
From: Gareth McCaughan <gareth.mccaughan@pobox.com>
On 10/03/2016 17:19, Warren D Smith wrote:
Well, if software testers refused to try to find bugs due to "etiquette" then there would be a lot more bugs.
My guess as to the actual question here: once the game gets near the end, I suspect AlphaGo's tree search will produce extremely strong play even if its neural networks mess up. So even conditional on the scenario you describe where AlphaGo's training hasn't equipped it to evaluate things well in unusual positions, I think it's very unlikely that playing on would have given Lee Sedol a non-negligible extra chance of winning.
-- Jim Gillogly
It eventually resigned - I don't know whether it's wired for that
... yes it is. A commentator on the live YouTube coverage said that when AlphaGo estimates its chances of winning have dropped below 10%, it resigns. Best, É.
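As a minimal sketch of that rule (the 10% figure comes from the live commentary, not from any official documentation, and the function name is invented):

```python
# Reported resignation rule: resign once the engine's own estimate
# of its winning chances falls below a fixed threshold.
RESIGN_THRESHOLD = 0.10  # figure quoted by the live commentator

def should_resign(estimated_win_probability: float) -> bool:
    """Return True when the self-estimated win probability is below threshold."""
    return estimated_win_probability < RESIGN_THRESHOLD

print(should_resign(0.08))  # True
print(should_resign(0.30))  # False
```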
On 13 March 2016 at 10:16, Jim Gillogly <scryer@gmail.com> wrote:
It eventually resigned - I don't know whether it's wired for that
Is it known whether DeepMind had, or was trained on, a database of Lee Sedol's previous games? (If so, that seems kind of unfair, since he didn't have the same privilege.) —Dan
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
On 2016-03-13 03:17, Dan Asimov wrote:
Is it known whether Deep Mind had or was trained on a database of Lee Sedol's previous games?
(If so, that seems kind of unfair, since he didn't have the same privilege.)
DeepMind's AlphaGo was trained on amateur games from an Internet go server (and then played "millions" of games against itself). It had seen no games of Lee Sedol. A reporter at the press conference asked exactly your question (including the question about the unfairness of the information mismatch, if it existed).
"unfairness"
... is a strange concept here. How could it be defined? What about turning AlphaGo overnight into a "prudent" or "weak" mode? I see two purposes: one is being "polite" towards Seoul, South Korea, and Asia in general - the DeepMind officials, in the press conference after game 3, were almost begging for mercy, apologizing for the victory and saying every three sentences how "fantastic" Lee Sedol was - and the second is research (how does AlphaGo compute and "behave" in real life against a human champion and under time pressure?). With the huge amounts of money, reputation, and so on at stake, "unfairness" is the last word I would use here. Bye, É.
Catapulted from my aPhone
On 13 March 2016 at 14:07, Michael Greenwald <mbgreen@seas.upenn.edu> wrote:
participants (4)
- Dan Asimov
- Eric Angelini
- Jim Gillogly
- Michael Greenwald