Unbounded sieve of Eratosthenes

The normal sieve of Eratosthenes enumerates primes by taking a list of natural numbers and striking off from it each multiple of each prime as the primes are found. The primes are the numbers which haven’t been stricken off the list yet.

One downside to the normal sieve is that you have to decide up front the upper bound on your primes is going to be: you have to make that list to strike numbers off.

However, with the right data structure, it’s possible to make an unbounded sieve. I like to think of it as a long Mancala board with stones colored by the primes in the cups. You look at each cup one at a time in increasing order. When a cup contains no stones, it corresponds to a prime number, and you create a new stone. Whenever you have a prime stone colored with p, you move it p cups forward. That’s the algorithm.

As a Python generator, you can represent the Mancala board as a dictionary whose keys are natural numbers and whose values are arrays containing primes. In fact, once the algorithm gets to a particular number, the elements of the array are all of its prime factors (without multiplicity).

def primes() :
    counters = {}
    p = 2
    while True :
        while p in counters :
            for step in counters[p] :
                counters.setdefault(p+step, []).append(step)
            del counters[p]
            p += 1
        yield p
        counters[p] = [p]

Another way to understand this is as an inversion of the normal order of the loops in the sieve. The counters dictionary is keeping track of each strike-off loop currently in progress.

Of course, there are a few optimizations one can make to this, for instance only considering odd numbers, or perhaps only considering numbers which are divisible by neither two nor three.[1]


Thanks to Scott Kovach, here is a similarly short Haskell implementation (though with a O(log n) map):

import qualified Data.IntMap as M
import Data.List

primes :: [Int]
primes = step 2 M.empty
  where step :: Int -> M.IntMap [Int] -> [Int]
        step n m | Just vs <- M.lookup n m =
          step (n+1) $ foldl' (\m v -> M.insertWith (++) (n+v) [v] m) m vs
        step p m = p : step p (M.insert p [p] m)


[1] This also has the optimization that, if a number is composite and between p and p2, then it is divisible by a second prime as well.
def primes() :
    counters = {}
    yield 2
    p = 3
    while True :
        while p in counters :
            for step in counters[p] :
                counters.setdefault(p+2*step, []).append(step)
            del counters[p]
            p += 2
        yield p
        counters[p*p] = [p]
        p += 2