23 Dec
2017
1:45 a.m.
* rcs@xmission.com <rcs@xmission.com> [Dec 23. 2017 09:21]:
This is mostly for the old-timers.
[...]
As division has cheapened, 1MR has mostly fallen out of use.
Writing L for latency (in cycles) and T for throughput (in instructions/cycle), we roughly have

  add/sub  L=1   T=3(!)
  mul      L=4   T=1/2
  div      L=60  T=1/60

for both floating point and integer operands. So saving divisions is very much worth it.

The usual suspect is vector normalization:

  double N = V.norm();
  for (unsigned j=0; j<len; ++j)  V[j] /= N;

is better written as

  double N1 = 1.0 / V.norm();
  for (unsigned j=0; j<len; ++j)  V[j] *= N1;

The compiler is not allowed to do this (unless there is some --ariane5 optimization option).

Best regards, jj
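For anyone who wants to try the two normalization variants side by side, here is a minimal self-contained sketch. It assumes the vector is a plain std::vector<double>; the free function norm() is a hypothetical stand-in for V.norm(), and the names normalize_div and normalize_mul are made up for illustration. Neither variant guards against a zero vector.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean norm of v (hypothetical helper standing in for V.norm()).
double norm(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) sum += x * x;
    return std::sqrt(sum);
}

// Variant 1: one division per element.
void normalize_div(std::vector<double>& v) {
    const double n = norm(v);
    for (std::size_t j = 0; j < v.size(); ++j) v[j] /= n;
}

// Variant 2: a single division up front, then one multiplication per element.
void normalize_mul(std::vector<double>& v) {
    const double n1 = 1.0 / norm(v);
    for (std::size_t j = 0; j < v.size(); ++j) v[j] *= n1;
}
```

The two variants can differ in the last bit, because x / n is rounded once while x * (1.0 / n) is rounded twice. That difference is the reason a conforming compiler may not make the substitution on its own. GCC does offer -freciprocal-math (implied by -ffast-math) for those who accept the change in rounding.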
[...]