From: rcs@xmission.com
Intel is introducing the PCLMULQDQ instruction, which does a 64x64->128 bit Xor-Multiplication.
On the first Google hit: http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-u... the definition has some typos. Fixing but improvising notation: Multiplying a[n-1] thru a[0] by b[n-1] thru b[0] for 0 <= i <= n-1: c[i] = xor( [ a[j] b[i-j] for 0 <= j <= i ] ) for n <= i <= 2n-1: c[i] = xor( [ a[j] b[i-j] for i-n+1 <= j <= n-1 ] ) Note that when i = 2n-1, i-n+1 = n, and there is no j: n <= j <= n-1, but the identity element of xor is 0, so c[2n-1] = 0 I find it a little strange to say that, e.g., "GCM requires the carry-less multiplication of 128-bit operands, producing a 256-bit product." As opposed to a 255-bit product with a zero on top. A picture to confirm the intuition would have helped: a3 a2 a1 a0 PCLMULQUQ b3 b2 b1 b0 ----------------------------- a3b0 a2b0 a1b0 a0b0 a3b1 a2b1 a1b1 a0b1 a3b2 a2b2 a1b2 a0b2 (+) a3b3 a2b3 a1b3 a0b3 -------------------------------------- 0 c6 c5 c4 c3 c2 c1 c0
Knuth is *emphatically* calling for 8x8bit matrix multiply, both xor and ior. --rwg
There is a kind of associative memory that uses a hash where you rotate the rows instead of shifting, and the sum is a majority (actually half-or-more) operation. --Steve
participants (1)
-
Steve Witham