This algorithm might not be so bad for computing logs in parallel on a GPU (Graphics Processing Unit, e.g., from AMD or NVIDIA). GPUs have a lot of processing power but relatively small amounts of memory, and memory access latency can cost many processing cycles.

At 02:12 PM 10/18/2012, rcs@xmission.com wrote:
There was an old trick for calculating floating-point log_2(x) back in assembly-language days: Assume 1.0 <= x < 2.0. Square it 7 times, converting mantissa bits (bits to the right of the binary point) into exponent bits. Decant the 7 exponent bits into the nascent logarithm; zero them out in the x^128 value, bringing it back into the [1.0, 2.0) range. Rinse, repeat, till you've got 28 bits of logarithm. (The PDP-6 had 36-bit words.)
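A minimal C sketch of that loop. It uses frexp/ldexp to read and clear the exponent field rather than the direct bit surgery an assembly version would do; the 4 rounds x 7 bits = 28 bits follow the description above, everything else (names, the demo value 1.5) is an assumption of the sketch:

    #include <stdio.h>
    #include <math.h>

    /* log2(x) for 1.0 <= x < 2.0 by repeated squaring.
       Squaring x seven times turns the top 7 fraction bits of
       log2(x) into the binary exponent of x^128; read that
       exponent off, clear it, and repeat.  4 rounds = 28 bits. */
    double log2_by_squaring(double x)
    {
        double result = 0.0;
        double scale  = 1.0;            /* weight of the next 7-bit chunk */
        for (int round = 0; round < 4; round++) {
            for (int i = 0; i < 7; i++)
                x *= x;                 /* x -> x^128 */
            int e;
            double m = frexp(x, &e);    /* x = m * 2^e, 0.5 <= m < 1 */
            scale  /= 128.0;
            result += (e - 1) * scale;  /* e-1 = floor(log2(x^128)) */
            x = ldexp(m, 1);            /* zero the exponent: back into [1,2) */
        }
        return result;                  /* truncated to 28 bits, ~8 digits */
    }

    int main(void)
    {
        printf("%.8f vs %.8f\n", log2_by_squaring(1.5), log2(1.5));
        return 0;
    }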
A similar trick essentially reproduces SP's idea. Iterate cos(2x) = 2 (cos x)^2 - 1; keeping track of the sign of cos(2^N x) will read out x/pi in binary. (I learned these from Gosper.)
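One way to spell out the "keeping track" step (my reading, not stated above): for 0 <= x < pi, cos(2^n x) is negative exactly when bits n and n+1 of x/pi differ, so the signs form the Gray code of x/pi, and each new bit is the previous bit XOR-ed with the sign flag. A C sketch under that assumption; each doubling roughly doubles the rounding error in a double, so only the first ~30 bits are trustworthy:

    #include <stdio.h>
    #include <math.h>

    /* Bits of x/pi from the signs of cos(2^n x), iterating
       cos(2x) = 2 cos(x)^2 - 1.  The sign flips exactly when
       consecutive bits of x/pi differ (a Gray code), so each
       bit is recovered by XOR with the previous one. */
    int main(void)
    {
        double x = 1.0;                 /* 0 <= x < pi; x/pi = 0.31830988... */
        double c = cos(x);
        int bit = 0;                    /* integer-part bit of x/pi is 0 */
        printf("x/pi = 0.");
        for (int n = 0; n < 30; n++) {  /* ~30 bits before roundoff wins */
            bit ^= (c < 0.0);           /* decode next bit from the sign */
            putchar('0' + bit);
            c = 2.0 * c * c - 1.0;      /* cos(2^(n+1) x) */
        }
        putchar('\n');                  /* expect 010100010111110011000... */
        return 0;
    }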
Rich