I read a nice blog post this morning about implementing the Sieve of Eratosthenes in Lisp. It got me thinking about something that I did a few weeks back.
A few weeks back, I needed to generate all of the primes up to a given number. The number was big enough that I didn’t want to keep an array of bytes (or even bits) big enough to do the sieving. I just had an aversion to keeping around O() memory when O() would suffice. I also had a hankering (probably because of Let Over Lambda) to implement it as a generator—a function that each time you called it gave you the next prime.
My first version amounted to trial-division against the entire list of primes generated so far. For the number of primes that I wanted to generate, this took more than nine minutes. Unacceptable.
I made some tweaks to it to ensure that I was only checking against the previously generated primes that were less than or equal to (isqrt n)
. I made some other tweaks to keep my previously generated primes sorted so the most-likely to be a divisor of were first. I made another tweak to accumulate the first primes into a product so that I could check all of those against with a (gcd n product-of-first-k)
where was constrained so that the product stayed below most-positive-fixnum
.
Those tweaks got me down to around two minutes. That was still more than twice as long as I wanted it to take.
Now, with the help of CL-HEAP and the Rosetta-Code-approved optimization mentioned in that blog post, I have it down to thirty seconds with no trial division, no square roots, and only one multiplication per prime.
I keep a heap of cons
cells. The car
is the next number that will be crossed out by this prime and the cdr
is how much to increment the car
once I’ve crossed out the car
.
Ignoring the special-case at the beginning for two, my current algorithm looks like this:
(let ((nixed (make-instance 'cl-heap:fibonacci-heap
:key #'first
:sort-fun #'<))
(guess 1))
(lambda ()
(loop :do (incf guess 2) :until (%prime-p guess nixed))
(cl-heap:add-to-heap nixed (cons (* guess guess) (* guess 2)))
guess)))
The %prime-p
function pulls cons
cells off of the heap until it finds whose car
isn’t equal to the current guess. For each of them where the car
was equal, it increments the car
by its cdr
. Then, it puts all of the cons
cells it removed back on the heap. The guess was prime if it wasn’t equal to any of the numbers on the heap.
(let ((prime t)
(n nil))
(loop :do (setf n (cl-heap:pop-heap nixed))
:while (and n (= (car n) guess))
:do (progn
(incf (car n) (cdr n))
(cl-heap:add-to-heap nixed n)
(setf prime nil))
:finally (when n (cl-heap:add-to-heap nixed n)))
prime))
I could probably save some time and cons
-ing by peeking at the top of the heap rather than always gathering or countless other ways, but this more than exceeded my expectations for today.
It sounds like you’re looking for a prime number sieve with the property of incrementality, such as that provided by Bengalloun; but rather than seeking out that paper, I would instead recommend Pritchard’s 1994 “Improved Incremental Prime Number Sieves” (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.835), which also incorporates Pritchard’s other such nifty ideas as converting sieves to ones computed strictly with addition operators or using symmetrical “wheel” structures to dramatically reduce space and time complexity.
Less dense than Pritchard but possibly more fun is the Sorenson, Dunten, and Jones classic “A Space Efficient Fast Prime Number Sieve” (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.3936), and also possibly Sorenson’s “Trading Space for Time in Prime Number Sieves” (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.9487).
Rumour has it that D. J. Bernstein’s (yes, djb of qmail fame) quadratic sieve is the “fastest” mechanism out there, but I have neither been able to pound my way through the math to a working implementation, nor found anyone who has done so, nor figured out whether “fastest” means asymptotically or practically.
Thank you for all of these references. I will definitely check them out. I had lots of references on finding probable primes and proving them prime, but that didn’t feel wholly useful in an incremental context.
I have read Dan Bernstein’s stuff before but didn’t remember its time or space complexity.
After the two minute attempt, I tried a basic wheel-based sieve. My code size blossomed, but wasn’t nmeasurably faster to a million than the two-minute code.
I will check out the references. Thanks again.
Oi. Some update of WordPress totally scrogged my comment area CSS. This looks awful.
If you want to go to the other extreme, you should try implementing “Conway’s Prime Producing Machine”: http://www.jstor.org/stable/2690263
I implemented it myself a few weeks ago in C. It started running into overflow problems once it got up to 11 (or maybe 13)!
Wow. I hadn’t seen that before (at at least not in recent memory). Who does Conway’s illustrations? I’ve seen lots of his illustrations and hand-drawn charts and they all have that same feel. Does he draw them himself? My Google-fu isn’t up to that.
I’m guessing Richard K. Guy does, since he wrote the paper and no one else is credited for the drawings. But I don’t know for sure.
I need to get back to finishing my Conway’s Prime Producing Machine implementation. Right now I’m in the middle of implementing the BaileyâBorweinâPlouffe formula.