[This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# As a homage to Yitang Zhang who has proven a mind-bending property of Prime Pairs, I have written a prime Sieve to detect all of the prime numbers from 1 to N.

# There might very well be a function in the base package that already does this.  No doubt there are a dozen math packages out there which does this.  However, it is the first time I have programmed a Prime Sieve :)

# A prime sieve is a simple algorithm which grabs the first number after 1 and eliminates all numbers devisible by it.  Then it grabs the next number in the set remaining and does the same for that.

primes = function(n=1000, printProgress=F) {
# 1 is always in the list
prime = 1
# The availabe set we look at as greater than 1 up to n
set = 2:n
# Loop through the set dropping anything which is not a prime and the primes as we get to them as well
while (length(set)>0) {
# Add the first number we encounter to our prime list
prime = c(prime, set[1])
#
set = set[floor(set/set[1])!=set/set[1]]
if (printProgress) print(paste("Elements Remaining: ",length(set)))
}
return(prime)
}

# This works pretty fast.
primes1k = primes(printProgress=T)
# R finds the primes of the first 1,000 integers takes a little longer

# See it mapped out
require(ggplot2)
qplot(primes1k) + geom_histogram(aes(fill = ..count..))

# To look at this idea of prime pairs lets, look at the
primes100k = primes(10^5, printProgress=T)
qplot(primes100k) + geom_histogram(aes(fill = ..count..))
# Finding the primes of the first 100,000 numbers takes much longer

# There is a bit of a stretch near the beginning in which there is between 40,000 and 70,000 elements left in the remaining set in which identification of primes does not eliminate any more than a few elements from the set.  After 40,000 elements things start speeding up because the list gets shorter and is able to be scanned faster.
primes1m = primes(10^6, printProgress=T)
qplot(primes1m) + geom_histogram(aes(fill = ..count..))

length(primes1m)
# This identifies 78,499 prime numbers between 1 and 1 million.

10^6/length(primes1m)
# If primes were distributed evenly they would be on average 12.7 numbers apart.

# Yitan Zhang proves the astonishing fact that there are in infinite number of primes no farther than the distance of 70 million.  This is true even when pairs of primes might be a great deal further than that from other pairs.

# There are no known uses of this theory.  However, once again a mathematician has proved something fundamental about numbers which might aid humananity in the distant future.  Currently Fermat's little theorem is widely used as the basis for modern cryptography.  Perhaps, Yitan Zhang's will be the basis for equally important work in the future.
Syntax Highlighting by Pretty R at inside-R.org