Re: [math-fun] Fixed points of Google translate ?

2 Mar 2013

On Sat, Mar 2, 2013 at 11:19 AM, Mitchell <mitchell.v.riley@gmail.com> wrote:
...
The step from English to Japanese would
give the word both in English and Japanese, so the total length
doubled each round trip. The Bing translate isn't fooled by that any
more.
I often have to translate to and from Japanese myself. The large-number
name "Nayuta" (那由他 or 那由多, "10^60" originally from Sanskrit meaning "impossibly
large number") was imported from Chinese by Buddhist priests and was never
native even in China. On translationparty.com it quickly expands into the
somewhat humorous "All Satan and Devil Devil Devil-multiple solutions with
multiple solutions".

A feature (or common idiomatic construct) in Japanese that contributes to
word-doubling is that a word appearing once is a modifier, but used twice
it can mean an absolute thing.

For example, "almond" is アーモンド in Japanese, which is simply a phonetic
transliteration ("Ahh-mondo."). When I asked for "almonds" in a gift shop I
was directed to candy bars containing almonds. I repeated my request, "いいえ
アーモンド。アーモンドアーモンド。" (which I'm sure was complete nonsense, but crudely
translated I said: "NO almond. Almond Almond!".) I was then directed to the
almonds.
...
On 3 March 2013 00:29, Henry Baker <hbaker1@pipeline.com> wrote:
...
Let gt(x) be "Google translate" of some corpus x from some language D
into some language R.
Let gt^-1(y) be the "Google translate" of y in the language R back to
the language D.
Let rt(x) be the "round trip" translate of x in D to R and back to D.
What are the fixed points of rt(x) ?
Naturally we'd like to ask this question about arbitrary language-pairs,
thus the gt() function should have a subscript indicating the "from" and
"to" languages. My examples above involve gt_en_ja(x) and gt_ja_en(x). For
any i != j, the gt' function is just the gt function with the subscripts
reversed: gt'_j_i() = gt_i_j(), and rt(x) is gt_j_i(gt_i_j(x))
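
As a rough illustration of the subscripted notation, here is a minimal Python
sketch. It does not call Google Translate: the two word tables are invented toy
stand-ins for gt_en_ja and gt_ja_en, and the names make_gt, round_trip and
iterate_to_fixed_point are hypothetical helpers chosen for this example only.

```python
from typing import Callable, Dict, Optional

Translator = Callable[[str], str]


def make_gt(table: Dict[str, str]) -> Translator:
    """Build a toy word-by-word translator; unknown words pass through unchanged."""
    def gt(text: str) -> str:
        return " ".join(table.get(word, word) for word in text.split())
    return gt


# Toy stand-ins (assumptions, not real Google Translate output).
gt_en_ja = make_gt({"almond": "アーモンド", "nut": "ナッツ", "kernel": "ナッツ"})
gt_ja_en = make_gt({"アーモンド": "almond", "ナッツ": "nut"})


def round_trip(x: str, forward: Translator, backward: Translator) -> str:
    """rt(x) = gt_j_i(gt_i_j(x))."""
    return backward(forward(x))


def iterate_to_fixed_point(x: str, forward: Translator, backward: Translator,
                           max_rounds: int = 20) -> Optional[str]:
    """Apply rt repeatedly; return the first x with rt(x) == x, or None if none is found."""
    for _ in range(max_rounds):
        y = round_trip(x, forward, backward)
        if y == x:
            return x
        x = y
    return None
```

With these toy tables, "almond" is already a fixed point, while "kernel almond"
settles on "nut almond" after one round trip.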
...
...
They obviously imply fixed points of gt(rt(x)).
There are many trivial fixed points in Google translate, which leaves
unrecognized words alone.

Slightly less trivially, western proper nouns are transliterated into
Katakana, and commercial names (companies, brands) often use the Latin
alphabet because it is appealing.  "Robert's almonds" finds a fixed point
very quickly because of Katakana transliteration.
...
[...]
Are there any "implosions", where rt^n(x) becomes empty?
...
Google never seems to return a blank string -- at the very least it will
just give you the same thing you put in, untranslated.
...
...
Are there cycles, such that rt^n(x) never converges, but
rt^(n+m)(x) = rt^n(x) for some m and for all n > some p?
It would naturally be fun to consider three-language cycles, such as
gt_en_fr(gt_de_en(gt_fr_de(x))) which I'm sure would have been of interest
to Hofstadter when he was working on Lewis Carroll's Jabberwocky.
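
The cycle question, and the three-language composition, can be sketched in
Python as follows. This is only a sketch under stated assumptions: find_cycle
and compose are hypothetical helper names, the lookup tables are invented toy
data rather than real translator output, and the returned p is the first index
from which the sequence repeats (so the identity holds for all n >= p).

```python
from functools import reduce
from typing import Callable, Dict, Optional, Tuple

Step = Callable[[str], str]


def compose(*steps: Step) -> Step:
    """compose(f, g, h)(x) == f(g(h(x))), matching the nested notation."""
    return lambda x: reduce(lambda acc, step: step(acc), reversed(steps), x)


def find_cycle(rt: Step, x: str, max_steps: int = 1000) -> Optional[Tuple[int, int]]:
    """Return (p, m) with rt^(n+m)(x) == rt^n(x) for all n >= p, or None if not found.

    m == 1 means the sequence reached a fixed point.
    """
    seen: Dict[str, int] = {}  # value -> number of rt applications that produced it
    for n in range(max_steps):
        if x in seen:
            p = seen[x]
            return p, n - p
        seen[x] = n
        x = rt(x)
    return None


# Toy round-trip map (an assumption): "a" leads into the 2-cycle "b" <-> "c".
toy_table = {"a": "b", "b": "c", "c": "b"}
toy_rt: Step = lambda s: toy_table.get(s, s)

# Toy single-word translators for a three-language cycle (also assumptions).
gt_fr_de: Step = lambda s: {"chat": "Katze"}.get(s, s)
gt_de_en: Step = lambda s: {"Katze": "cat"}.get(s, s)
gt_en_fr: Step = lambda s: {"cat": "chat"}.get(s, s)
three_way = compose(gt_en_fr, gt_de_en, gt_fr_de)
```

Here find_cycle(toy_rt, "a") reports p = 1 and m = 2, and three_way("chat")
returns "chat", a fixed point of the toy three-language cycle.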
...
...
For a given language, and a word w within that language, there must
exist at least one comprehensible sentence containing that word w.
Determine the fixed points for each of these sentences.
Not for any word w, because there are instances of two words in language A
getting translated into the same word in language B. Perhaps I
misunderstand your assertion.
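
To make the many-to-one objection concrete, here is a small Python sketch in a
purely word-level model. The two tables are invented toy data (not Google
Translate output), and surviving_words is a hypothetical helper name.

```python
from typing import Dict, Set

# Toy tables (assumptions): two English words collapse onto one Japanese word.
forward: Dict[str, str] = {"big": "大きい", "large": "大きい", "almond": "アーモンド"}
backward: Dict[str, str] = {"大きい": "big", "アーモンド": "almond"}


def surviving_words(fwd: Dict[str, str], bwd: Dict[str, str]) -> Set[str]:
    """Words that come back unchanged after one round trip through the tables."""
    return {w for w, t in fwd.items() if bwd.get(t, t) == w}


lost = set(forward) - surviving_words(forward, backward)
```

In this toy model "large" always comes back as "big", so no sentence containing
"large" can be a fixed point.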

-- 
  Robert Munafo  --  mrob.com
  Follow me at: gplus.to/mrob - fb.com/mrob27 - twitter.com/mrob_27 -
mrob27.wordpress.com - youtube.com/user/mrob143 - rilybot.blogspot.com