On Sat, Mar 2, 2013 at 11:19 AM, Mitchell <mitchell.v.riley@gmail.com> wrote:
The step from English to Japanese would give the word both in English and Japanese, so the total length doubled each round trip. Bing Translate isn't fooled by that any more.
I often have to translate to and from Japanese myself. The large-number name "Nayuta" (那由他 or 那由多, "10^60" originally from Sanskrit meaning "impossibly large number") was imported from Chinese by Buddhist priests and was never native even in China. On translationparty.com it quickly expands into the somewhat humorous "All Satan and Devil Devil Devil-multiple solutions with multiple solutions". A feature (or common idiomatic construct) in Japanese that contributes to word-doubling is that a word appearing once is a modifier, but used twice it can mean an absolute thing. For example, "almond" is アーモンド in Japanese, which is simply a phonetic transliteration ("Ahh-mondo."). When I asked for "almonds" in a gift shop I was directed to candy bars containing almonds. I repeated my request, "いいえ アーモンド。アーモンドアーモンド。" (which I'm sure was complete nonsense, but crudely translated I said: "NO almond. Almond Almond!".) I was then directed to the almonds.
On 3 March 2013 00:29, Henry Baker <hbaker1@pipeline.com> wrote:
Let gt(x) be "Google translate" of some corpus x from some language D into some language R.
Let gt^-1(y) be the "Google translate" of y in the language R back to the language D.
Let rt(x) be the "round trip" translate of x in D to R and back to D.
What are the fixed points of rt(x) ?
Naturally we'd like to ask this question about arbitrary language-pairs, thus the gt() function should have a subscript indicating the "from" and "to" languages. My examples above involve gt_en_ja(x) and gt_ja_en(x). For any i != j, the gt' function (written gt^-1 above) is just the gt function with the subscripts reversed: gt'_j_i() = gt_i_j(), and rt(x) is gt_j_i(gt_i_j(x)).
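To make the subscripted notation concrete, here is a minimal Python sketch of rt(x) = gt_j_i(gt_i_j(x)) for i = en, j = ja. It does not call Google Translate; the word tables EN_JA and JA_EN and the helper make_toy_translator are hypothetical stand-ins, chosen only so that unrecognized words pass through unchanged and two English words collapse onto one Japanese word.

```python
from typing import Callable, Dict

Translator = Callable[[str], str]


def make_toy_translator(table: Dict[str, str]) -> Translator:
    """Word-by-word lookup; words missing from the table are left alone."""
    def translate(text: str) -> str:
        return " ".join(table.get(word, word) for word in text.split())
    return translate


# Hypothetical toy vocabularies (assumptions, not real translator output).
EN_JA = {"almond": "アーモンド", "cat": "猫", "kitty": "猫"}
JA_EN = {"アーモンド": "almond", "猫": "cat"}

gt_en_ja = make_toy_translator(EN_JA)
gt_ja_en = make_toy_translator(JA_EN)


def rt(x: str) -> str:
    """Round trip en -> ja -> en, i.e. gt_ja_en(gt_en_ja(x))."""
    return gt_ja_en(gt_en_ja(x))


def is_fixed_point(x: str) -> bool:
    """True when one round trip returns the input unchanged."""
    return rt(x) == x
```

In this toy model "kitty" is not a fixed point, because it comes back as "cat", while "cat" and any untranslated proper noun are.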
They obviously imply fixed points of gt(rt(x)).
There are many trivial fixed points in Google translate, which leaves unrecognized words alone. Slightly less trivially, western proper nouns are transliterated into Katakana, and commercial names (companies, brands) often use the Latin alphabet because it is appealing. "Robert's almonds" finds a fixed point very quickly because of Katakana transliteration.
[...]
Are there any "implosions", where rt^n(x) becomes empty?
Google never seems to return a blank string -- at the very least it will just give you the same thing you put in, untranslated.
Are there cycles, such that rt^n(x) never converges, but rt^(n+m)(x)=rt^n(x) for some m and for all n>some p ?
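One way to look for such cycles is to iterate the round trip and remember every string seen so far. The sketch below assumes the round trip is available as an ordinary function f from strings to strings; find_cycle and its max_steps cutoff are hypothetical names introduced here, and the cutoff exists because an expanding input (like the length-doubling case) may never repeat.

```python
from typing import Callable, Dict, Tuple


def find_cycle(f: Callable[[str], str], x: str,
               max_steps: int = 1000) -> Tuple[int, int]:
    """Return (p, m) such that f^(n+m)(x) == f^n(x) for all n >= p.

    p is the number of steps before the orbit enters its cycle and
    m is the cycle length; m == 1 means a fixed point was reached.
    Raises RuntimeError if nothing repeats within max_steps.
    """
    seen: Dict[str, int] = {}  # string -> step at which it first appeared
    step = 0
    while x not in seen:
        if step >= max_steps:
            raise RuntimeError("no repeat found; the orbit may be expanding")
        seen[x] = step
        x = f(x)
        step += 1
    p = seen[x]
    return p, step - p


# Toy stand-in for a round trip: "a" -> "b" -> "c" -> "b" -> ...
toy_table = {"a": "b", "b": "c", "c": "b"}


def toy_rt(s: str) -> str:
    return toy_table.get(s, s)
```

Here find_cycle(toy_rt, "a") reports one step of lead-in followed by a cycle of length 2, and any string the toy table does not know is reported as an immediate fixed point.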
It would naturally be fun to consider three-language cycles, such as gt_en_fr(gt_de_en(gt_fr_de(x))) which I'm sure would have been of interest to Hofstadter when he was working on Lewis Carroll's Jabberwocky.
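The three-language loop is just a longer composition. As a rough sketch, again with hypothetical word tables rather than real translator output, the chain gt_en_fr(gt_de_en(gt_fr_de(x))) can be built from a small left-to-right chain helper (a name assumed here for illustration):

```python
from functools import reduce
from typing import Callable, Dict

Translator = Callable[[str], str]


def toy_translator(table: Dict[str, str]) -> Translator:
    """Word-by-word lookup; unknown words pass through unchanged."""
    def translate(text: str) -> str:
        return " ".join(table.get(word, word) for word in text.split())
    return translate


def chain(*steps: Translator) -> Translator:
    """Apply steps left to right: chain(f, g, h)(x) == h(g(f(x)))."""
    def run(text: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, text)
    return run


# Hypothetical toy vocabularies (assumptions for illustration only).
gt_fr_de = toy_translator({"chat": "Katze", "minou": "Katze"})
gt_de_en = toy_translator({"Katze": "cat"})
gt_en_fr = toy_translator({"cat": "chat"})

# fr -> de -> en -> fr, i.e. gt_en_fr(gt_de_en(gt_fr_de(x)))
rt3 = chain(gt_fr_de, gt_de_en, gt_en_fr)
```

In this toy setup "minou" drifts to "chat" after one trip around the loop and then stays there, so the same fixed-point and cycle questions carry over unchanged to rt3.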
For a given language, and a word w within that language, there must exist at least one comprehensible sentence containing that word w. Determine the fixed points for each of these sentences.
Not for any word w, because there are instances of two words in language A getting translated into the same word in language B. Perhaps I misunderstand your assertion.

-- Robert Munafo -- mrob.com
Follow me at: gplus.to/mrob - fb.com/mrob27 - twitter.com/mrob_27 - mrob27.wordpress.com - youtube.com/user/mrob143 - rilybot.blogspot.com