[math-fun] Smith's SMODA factorer self-destructs
My factoring algorithm SMODA has self-destructed to the extent it now seems less attractive than the quadratic sieve as a factorer. ----More detailed explanation: The problem was embarrassingly stupid and would have been noticed as soon as an implementation was built but Andy JENNINGS saved time by spotting the problem before that. To review, SMODA depended on the first idea that, if there were a magic oracle whom you could ask "O Oracle, here is B and N, now please generate for me a set of B-smooth numbers near to N or near to multiples of N... such that these numbers are (i) otherwise pretty random, (ii) come approximately as close (in terms of additive error) to being a multiple of N as possible existencewise, (iii) come at a small cost, e.g. polynomial in logN cost, per number found", then factoring would be possible at speeds rivalling or exceeding the Number Field Sieve. The second idea was that instead of depending on magic, it was actually possible to build the oracle by combining a certain fairly small N-independent database with some simple greedy construction algorithms. The database was hard to find, hence still somewhat oracle-like, but it seemed possible to find it fast enough that challenging world records might be feasible. Now various potential problems were pointed out, which actually did not seem to be problems, such as 1) database too expensive to find? but actually just today my computer run is ending, and has constructed preliminary database up to 2^400 size. Jim White & I were building datasets of the "world's most extensive Stormer pair collection" and it seems clear it with feasible number of computers our current codes could reach 2^500 -- and likely a good deal further with as-yet-unimplemented algorithms. 2) Why not use LLL lattice reduction methods to construct database? As far as I can tell they are an uncompetitively poor idea. The real problem is this. The greedoid which uses the database to find smooth numbers (call them S) near to N or near to multiples of N, works in a manner somewhat analogous to the way binary numbers can be constructed lying arbitrarily near to any real. One keeps on adding smaller and smaller "correction vectors" in a certain vector space, and proves that a certain natural vector logarithmic distance measure keeps getting smaller, indeed at least a factor 2 smaller each whack. That works. BUT, the problem is, when you make that natural logarithmic distance smaller, that does NOT imply that the actual difference |S - multipleofN| keeps getting smaller back in the ordinary world of integer numbers, rather than the vector world. Instead, this difference gets smaller for a while, but once |S-N| is of order about squareroot(N), it almost always would start to grow again. Simple example: if I had, say, 10111 =approx= 10000 with error=111 and I used multiplication by the rational 101/100 to try to improve the approximation, I'd get 1011100 =approx= 1010000 with error=1100 which is the opposite of an improvement in the additive error (although it does improve the *relative* error). Improving the relative error is easy and was essentially what I was doing. Although in principle it basically is always possible to improve the additive error, (e.g. if we use 271/274 not 100/101 then get 2740081 which is 81 mod 10000, or if we use 991/1002 we get 10020001 which is 1 mod 10000) I don't see how SMODA can gaurantee that without an unfeasibly enormous database far larger than I had in mind. So the net effect is that SMODA should stop working approximately at the point where you try to exceed the performance of the quadratic sieve. And it probably will be worse than QS since it is more complicated. This is a pity because SMODA and associated theory and data was quite big and aesthetic looking, but now is seen to be a "castle in the air." Is there any hope to salvage SMODA? I'm not sure. You'd need to invent a way to use the database that would be far more powerful than the simple greedoids I had in mind. (But it would be permissible to consume a good deal more compute-time than the greedoid consumed.) One way to try do that would be to find a rational number (such as 271/274 in example above) that yields improvement (this is easy by Euclid alg.), but doing so in such a way its numerator was smooth (not so easy). For example based on the two good rationals above 271/264 and 991/1002, and 91/92 which is also good, we could look for smooth numbers 271a+991b+91c in various ways such as sieving or lattice techniques. Another salvage attempt would be to have stronger guarantees about the database than those I'd had in mind. For example, we believe based on both theoretical models and computer evidence that very small databases based on very small prime sets [of order only about (logN)^2 primes] should *exist* but I had given up on trying to construct databases that small because they asymptotically seemed much too hard to find. If, however, we nevertheless assumed SMODA had access to such a super-small super-good database, then SMODA might be able to become a lot more powerful due to "cancellations" arising from common prime factors re-occuring enough of the time. So, salvaging SMODA may or may not be possible. It is worth some thought. If anybody wants to have my SMODA parts I/II paper draft/shipwrecks and/or our the Smith+White datasets of Stormer pairs, let me know. Warren D. Smith http://rangevoting.org
participants (1)
-
Warren Smith