[math-fun] PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation
From: Marc LeBrun <mlb@well.com>
To: math-fun <math-fun@mailman.xmission.com>
Subject: Re: [math-fun] PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation
="Joerg Arndt" <arndt@jjj.de> As some of us are interested in random number generation, I'd like to point to https://www.cs.hmc.edu/~oneill/ I'll unlike find time to read the paper (preprint as of now), so comments are welcome.
Comments? Heh, be careful what you wish for! I'll try to be brief, but this provokes a pet peeve (please delete if unfun)...
I skimmed the paper, and at risk of missing something of value, I'll also pass on chewing deeper into all that verbal excelsior because I think it's ultimately a huge waste of time.
My opinion is that academics have created a tiny but evergreen cottage industry around PRNG design for decades by presuming the problems they are addressing are actually interesting.
One of the biggest outdated misapprehensions is the obsession with minimizing internal state: that may have been vital in the von Neumann era, but today it's just quaint. After decades of "Moore's Law", statements like the following from the paper sound like outtakes from a "Downton Abbey" episode:
"If the entire state of a random number generator can be represented in a single processor register, we may reasonably expect it to offer performance advantages compared to a generator that requires hundreds or thousands of bytes of memory to represent its state."
Confining attention to algorithms operating on a single register is like insisting on programming while peeping through a keyhole. A nice circus stunt if one can pull it off perhaps, but of little real relevance today.
As much as I love Don Knuth, I think he, with the best of intentions, did the art a disservice by obsessing at such length in TAOCP over the Linear Congruential Generator. I suspect in part this was because there was such a sunk investment in all that pretty analysis. I regret that a consequence has been a pernicious misplacement of priorities, resulting in blizzards of this kind of misdirected academic snow ever since. But what's done is done.
Ironically, TAOCP actually briefly treats some alternative approaches with much longer legs. Here's an example I've used, thanks to a suggestion by Bill Gosper:
======== implementation sketch ========
Take the structure of a classic 1965 "Tausworthe" (circular-shift-with-XOR) bit generator, expand each of the BITS to an entire WORD, and add (rather than XOR) their values to get the next output word. So a classic 64-BIT (say) shift-register generator expands into a 64-WORD generator (now with 4096 bits of state).
This is incredibly trivial to implement and very fast: a circular buffer of 64 "registers" (that will be cached in fast memory of course), a handful of auto-incrementing pointers and some additions. Not even multiplies or mods.
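To make that concrete, here is a minimal C sketch of my reading of the scheme (not production code): a 64-word circular buffer updated by adding two lagged entries, i.e. an additive lagged recurrence. The lags LAG_A and LAG_B are placeholders I picked for illustration, not vetted parameters; a real version would use lags known to give a maximal-period recurrence, and seeding the buffer with noise is left out.

#include <stdint.h>

#define BUF_SIZE 64          /* 64 words x 64 bits = 4096 bits of state */
#define LAG_A 63             /* illustrative lags only -- a real version would */
#define LAG_B 31             /* use lags giving a maximal-period recurrence    */

typedef struct {
    uint64_t buf[BUF_SIZE];  /* circular buffer, to be seeded with noise */
    unsigned pos;            /* current write position */
} wide_gen;

static uint64_t wide_gen_next(wide_gen *g) {
    /* add (rather than XOR) two lagged words; with a power-of-two buffer
       size the % reduces to a mask, so there are no real multiplies or mods */
    uint64_t a = g->buf[(g->pos + BUF_SIZE - LAG_A) % BUF_SIZE];
    uint64_t b = g->buf[(g->pos + BUF_SIZE - LAG_B) % BUF_SIZE];
    uint64_t x = a + b;      /* the carries supply the "noise" in the high bits */
    g->buf[g->pos] = x;
    g->pos = (g->pos + 1) % BUF_SIZE;
    return x;
}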
Now look at the LSB of the outputs: it is exactly the bit sequence the original Tausworthe generator produces. Moreover, each higher-order bit position follows the same recurrence, except that it is perturbed by the cascade of "noise" carries coming up from the lower-order bits.
This has a huge state space and hence a huge period (if you're obsessed with that), and I expect it is plenty good for almost any practical purpose.
However if you're worried about an adversary that's reading the output stream and trying to invert it, then you can mess up the rigid time correlations by adding a scrambling post-process (also essentially described in Knuth) as follows:
Take the upstream generator, call it R1, a second table T2 (likewise initialized with noise), and some other generator R2 (actually R1 and R2 could probably just be alternating outputs of the wide-Tausworthe thingy). Output T2[R2], then increment T2[R2] += R1.
Now T2 is accumulating a turbulent rain of the noise from R1, mixed by R2.
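A sketch of that post-process in the same spirit (again my reading: the post above doesn't fix a table size, so the 256 here, and reducing R2 modulo the table size to pick a slot, are assumptions for illustration):

#include <stdint.h>

#define T2_SIZE 256          /* illustrative size; table likewise seeded with noise */

typedef struct {
    uint64_t t2[T2_SIZE];
} scrambler;

/* r1 = next word from the upstream generator R1,
   r2 = next word from the other generator R2 */
static uint64_t scramble_next(scrambler *s, uint64_t r1, uint64_t r2) {
    unsigned i = (unsigned)(r2 % T2_SIZE);  /* R2 picks a slot */
    uint64_t out = s->t2[i];                /* output the slot's current value... */
    s->t2[i] += r1;                         /* ...then stir R1's noise into it */
    return out;
}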
======== end of implementation ========
This thing is cheap to implement, very fast and easy to understand and test.
But, also probably less fun for academic analysis (due to being both too easy and too hard).
And of course 4096 bits, or even a megabit, is a pittance nowadays.
Think it'd work? Maybe we can stop churning out papers about new PRNGs?
PS: for an interesting alternative for fast clean noise, but with different design criteria (basically NO state update: the n-th output is F[n], rather than an iterate F[F[F[...]]]), see
John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw, "Parallel Random Numbers: As Easy as 1, 2, 3," Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), New York, NY: ACM, 2011.
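For a flavor of that style, here is a tiny counter-based example. This is just the well-known SplitMix64-style mixer applied to a counter, not the Philox/Threefry generators of the paper above; it only illustrates the "output = F[n]" design.

#include <stdint.h>

/* The n-th output is a fixed mixing function of n -- no state to update,
   so any output can be computed independently and in parallel. */
static uint64_t counter_rng(uint64_t n) {
    uint64_t x = n * 0x9E3779B97F4A7C15ULL;       /* spread the counter */
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;  /* bit-mix */
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;  /* bit-mix again */
    return x ^ (x >> 31);
}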
--
I disagree with LeBrun.

First, if your PRNG consumes a megabit, its performance is going to suck because of cache misses.

Second, the idea LeBrun mentions was discussed by Knuth in TAOCP as Richard Brent's idea, and is one instance of what is called a "lagged Fibonacci" generator. As Brent realized, the least-significant bit is then a maximal-period sequence (if the parameters are chosen right). However, the Brent generator in my opinion sucks because, using an array of N words each W bits wide, we do not get a period of order 2^(N*W), which is what we want; we get a period more like order 2^(N+W), and the high bits are more random than the low bits in each word. It also is known to fail various tests in the BigCrush battery. LeBrun's next idea is basically a scrambler similar to one Knuth mentions, applied to the output of his lagged Fibonacci. That also sucks cache-wise. Other kinds of lagged-Fibonacci generators can do better, however.

Third, I agree with LeBrun that the academics have gone a bit nuts, and often done badly, but there is some rationale: the fact is, no generator I fully approve of has yet been constructed, that I know of. So make one. I agree we should be annoyed, but not because this area is being worked on. We should be annoyed because the work has not been good enough. And Knuth's own suggestions were also quite poor, and usually fail BigCrush.
Warren D Smith