So you probably won't be shocked to know that the most common ratio is 1:1; this happens 584,970 times. But you might be surprised by the *second* most common ratio! It is 3301/3299, and it occurs *2476* times. I see a total of 26,044 distinct ratios. Huffman encoding them will probably work fairly well, but the dictionary itself will be somewhat large.

The best I've seen so far is to simply print -n when the largest ("last") prime in the factorization is p_n and g(n) is just g(n-p_n)*p_n; this happens 408,967 times out of the 415,031 unique values for g(n). For the rest, for now I just print the entire factorization. This compresses down (with gzip) to 1.155M.

But I think we can do even better. I notice that what's actually going on, for the most part, is patterns of 0's and 1's shifting at the end of the numbers. These correspond to ratios such as 3301/3299 (essentially: take the rightmost prime and move it "up" one). I also see the move: take the leftmost 0 (unused prime) and move it "down" one (so p_i^1 p_{i+1}^0 internally becomes p_i^0 p_{i+1}^1). I definitely see repeating patterns both on the left (prefixes of values > 1) and on the right (zeros and ones). I bet I can get this down to less than half a byte per value . . .

On Wed, Nov 16, 2016 at 5:13 PM, Tomas Rokicki <rokicki@gmail.com> wrote:
Awesome idea! Let me rip through this. I'm currently working with a factored representation so this is pretty easy to answer, I believe.
On Wed, Nov 16, 2016 at 5:07 PM, Allan Wechsler <acwacw@gmail.com> wrote:
Tom's musing about how much a table of Landau's function could be compressed inspired me to scribble out the first quotients, that is, the ratios between successive Landau values.
It seems like there aren't that many. Up through g(25), I see 1 (9 times), 2 (5 times), 3/2 (5 times), 4/3 (3 times), 5/4 (2 times), and 7/5 (1 time). That's only six values.

How many ratios actually occur up through 1M? If it's smaller than 256, then we could encode each ratio in two bytes; that would take us down to the neighborhood of 2 MB right there. And the ratios have very different rates of occurrence: using a Huffman encoding would probably push us past Tom's 1.89 MB. In my hand-worked example up through g(25), 1, 2, and 3/2 would be coded with 2 bits, 4/3 with 3, and 5/4 and 7/5 with 4 bits. The body of the table, therefore, would be 59 bits long, a bit over 2 bits per entry on average. (Of course I haven't included the cost of the conversion table, the one that says 00 -> 1, 01 -> 2, 10 -> 3/2, 110 -> 4/3, and so on. But for larger n this would be a tiny part of the content.)
Tom, how hard would it be to repeat the arithmetic I just did on the million-entry table?
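That arithmetic is easy to mechanize. Here is a throwaway sketch that redoes the hand count through g(25), using a standard prime-power knapsack DP for Landau's function (my own scratch code, not Tom's table), and taking g(0) = 1 so there are 25 quotients:

```python
from collections import Counter
from fractions import Fraction

def landau(N):
    # g[n] = largest lcm of any partition of n, by a prime-power knapsack DP
    g = [1] * (N + 1)
    ps = [p for p in range(2, N + 1)
          if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    for p in ps:
        for n in range(N, 0, -1):   # descending, so each prime enters at most once
            pk = p
            while pk <= n:
                g[n] = max(g[n], g[n - pk] * pk)
                pk *= p
    return g

g = landau(25)
tally = Counter(Fraction(g[n], g[n - 1]) for n in range(1, 26))
for ratio, count in tally.most_common():
    print(ratio, count)
# six distinct ratios: 1 (x9), 2 (x5), 3/2 (x5), 4/3 (x3), 5/4 (x2), 7/5 (x1)
```

The same loop with a larger N bound would answer the 1M question directly, memory permitting.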
On Wed, Nov 16, 2016 at 6:35 PM, Tomas Rokicki <rokicki@gmail.com> wrote:
Representing each result by a row of the form
7 4 3 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 0 1
(for instance) meaning 2^7 3^4 5^3 7^2 11^2 ...
and squeezing out duplicate rows yields something which gzip eats for lunch.
The entire result set through 1M compresses down to 3.0MB.
Bzip2 takes that down to 1.89MB.
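For concreteness, the dedup-then-compress step might look like the sketch below. The exact row format is my assumption (one space-separated exponent row per distinct value), and landau() is a throwaway DP, not Tom's actual pipeline:

```python
import bz2
import gzip

def landau(N):
    # g[n] = largest lcm of any partition of n (prime-power knapsack DP)
    g = [1] * (N + 1)
    ps = [p for p in range(2, N + 1)
          if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    for p in ps:
        for n in range(N, 0, -1):
            pk = p
            while pk <= n:
                g[n] = max(g[n], g[n - pk] * pk)
                pk *= p
    return g

def exponent_row(v):
    # exponents of v over consecutive primes 2, 3, 5, ... up to its largest
    row, p = [], 2
    while v > 1:
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            e = 0
            while v % p == 0:
                e, v = e + 1, v // p
            row.append(e)
        p += 1
    return row

g = landau(200)
rows, seen = [], set()
for n in range(1, 201):
    if g[n] not in seen:            # squeeze out duplicate rows
        seen.add(g[n])
        rows.append(" ".join(map(str, exponent_row(g[n]))))
blob = "\n".join(rows).encode()
print(len(blob), len(gzip.compress(blob)), len(bz2.compress(blob)))
```

Even at this toy scale the repetitive digit rows compress well, which is consistent with gzip "eating it for lunch".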
Giving just the ratio to the primordial of the highest prime (expressed in this way for the positive factors, and as negative values for the negative factors) will probably be even more effective. So the above row might be written
19 6 3 2 1 1 1 1 1 -18
which means the 19th primordial, times 2^6 * 3^3 ..., divided by the 18th prime.
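Here is one possible reading of that encoding in code (my reconstruction, not Tom's; note that with 1-based prime indices the sample row comes out as the 21st primordial with the 20th prime divided out, so the worked figures above may follow a different indexing convention):

```python
def primes(k):
    # first k primes by trial division
    ps, n = [], 2
    while len(ps) < k:
        if all(n % p for p in ps):
            ps.append(n)
        n += 1
    return ps

def encode(exps):
    # exps[i-1] is the exponent of the i-th prime; emit [m, the leading
    # exponents of the quotient by the m-th primordial, then -i for each
    # prime p_i the quotient divides by]
    m = max(i for i, e in enumerate(exps, 1) if e > 0)
    rel = [e - 1 for e in exps[:m]]          # quotient by the m-th primordial
    pos = [max(e, 0) for e in rel]
    while pos and pos[-1] == 0:              # drop trailing zero exponents
        pos.pop()
    neg = [-i for i, e in enumerate(rel, 1) if e < 0]
    return [m] + pos + neg

def decode(code):
    ps = primes(code[0])
    v = 1
    for p in ps:                             # the m-th primordial itself
        v *= p
    i = 0
    for c in code[1:]:
        if c >= 0:                           # positional exponent
            v *= ps[i] ** c
            i += 1
        else:                                # divide out the |c|-th prime
            v //= ps[-c - 1]
    return v

row = [7, 4, 3, 2, 2, 2, 2, 2] + [1] * 11 + [0, 1]   # the sample row above
value = 1
for p, e in zip(primes(len(row)), row):
    value *= p ** e
code = encode(row)
print(code, decode(code) == value)
```

The round trip decode(encode(row)) recovers the original value, which is the property any such encoding needs.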
On Wed, Nov 16, 2016 at 3:25 PM, <rcs@xmission.com> wrote:
If the only concern is file compression (ignoring the interesting math question), then a single bit per row, meaning "this row is compressed using the PrimePowerConjecture" or "this row is not compressed", would collect most of the savings.
Rich
----------
Quoting Joerg Arndt <arndt@jjj.de>:
* Joerg Arndt <arndt@jjj.de> [Nov 16. 2016 16:02]:
About "file gets too big": Is it true that every entry V[n] for n >= 2 is of the form V[n] = V[n-pp]*pp with pp a prime power? If so, we could just store pp, making the file much smaller.
Nope, at least assuming that pp must be a divisor of n (wrong for n = 29 and many larger n).
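For what it's worth, a quick brute-force check (a throwaway sketch with my own DP for g, not Joerg's code) suggests n = 29 breaks the recurrence even without requiring pp to divide n: no prime power pp <= 29 satisfies g(29) = g(29 - pp) * pp.

```python
def landau(N):
    # g[n] = largest lcm of any partition of n (prime-power knapsack DP)
    g = [1] * (N + 1)
    ps = [p for p in range(2, N + 1)
          if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    for p in ps:
        for n in range(N, 0, -1):
            pk = p
            while pk <= n:
                g[n] = max(g[n], g[n - pk] * pk)
                pk *= p
    return g

def is_prime_power(m):
    # True iff m = p^k for a single prime p, k >= 1
    for p in range(2, m + 1):
        if m % p == 0:
            while m % p == 0:
                m //= p
            return m == 1
    return False

g = landau(29)
witnesses = [pp for pp in range(2, 30)
             if is_prime_power(pp) and g[29 - pp] * pp == g[29]]
print(g[29], witnesses)
```

Here g(29) = 2520 = 8 * 9 * 5 * 7, and the witness list comes back empty.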
It seems possible to replace every entry with a reference to some entry before it, together with the ratio between the two (the ratios do seem to always be prime). This way the file produced is significantly smaller.
The (slight) drawback is that to recover a value one must collect a (small) selection of earlier values from the table, multiplying the ratios along the way. The selection is found just by following the references.
This could also give an interesting graph (tree).
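A sketch of how such a reference table might be built for small n (my reconstruction of the idea, not jj's code; rows whose value is unchanged just point back one entry with ratio 1):

```python
def landau(N):
    # g[n] = largest lcm of any partition of n (prime-power knapsack DP)
    g = [1] * (N + 1)
    ps = [p for p in range(2, N + 1)
          if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    for p in ps:
        for n in range(N, 0, -1):
            pk = p
            while pk <= n:
                g[n] = max(g[n], g[n - pk] * pk)
                pk *= p
    return g

def is_prime(m):
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

g = landau(30)
refs = {}   # n -> (j, ratio) with g[n] == g[j] * ratio
for n in range(2, 31):
    if g[n] == g[n - 1]:
        refs[n] = (n - 1, 1)          # value unchanged: trivial reference
        continue
    for j in range(n - 1, -1, -1):    # nearest earlier entry with prime ratio
        q, r = divmod(g[n], g[j])
        if r == 0 and is_prime(q):
            refs[n] = (j, q)
            break
print(refs[29], all(g[n] == g[j] * q for n, (j, q) in refs.items()))
```

The (n, j) edges are exactly the tree mentioned above; up to n = 30 every increasing entry does find an earlier entry with a prime ratio.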
Best regards, jj
[...]
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- -- http://cube20.org/ -- [ <http://golly.sf.net/>Golly link suppressed; ask me why] --