Skip to main content

Number of Primes

Anderson's Theorem

(a) The number of primes in [1,n] is no more than 2+floor(n/2).

The probability of n being prime when n is not prime is 1/2 - see Dasgupta,Papadimitriou,Vazirani "Algorithms" page 26. Therefore, the E(pi(n)) is n/2.

(b) There does not exist another set of adjacent primes other than {1,2,3}

5: 2 + floor(5/2) = 2 + 2 = 4:=> {1,2,3,5} : 4 <= 4
7: 2 + floor(7/2) = 2 + 3 = 5 => {1,2,3,5,7} : 5 <= 5
11: 2 + floor(11/2) = 2 + 5 = 7 => {1,2,3,5,7,11} 6 <= 7
26: 2 + floor(26/2) = 15 => {1,2,3,5,7,11,13,17,19,23} : 10 <= 15

Lagrange's Theorem is Inaccurate

Lagrange's theorem about primes states that pi(x) is the number of primes <= x. The pi(x) is approximately x/ln(x). He postulated that the lim of pi(x)/(x/lnx) as x-> infinity was 1. This is incorrect. if the number of primes is bounded by n/2 then refactoring and reducing Lagrange's Theorem results in the lim of ln(x) as x approaches infinity. This is always infinity.

Lagrange's theorem on some tests:

5: 5 / ln(5) = 3.1, incorrect
7: 7 / ln(7) = 3.59, incorrect
11: 4.58, incorrect
26: 7.9, incorrect

Using Anderson's Theorem, the number of prime numbers in [1,74] is 39. Lagrange says there is only 12.

Fun With Primes

Let's make some primes to test out my theory versus Lagrange. The goal is the find all of the primes between 1 and N using Lagrange's estimate as the stopping criteria versus mine.

def is_prime(p) :
    if p == 1 or p == 2 or p == 3 :
        return True
    for j in range(2,int(np.sqrt(p)+0.5)+1) :
        a = p // j
        if p - a*j == 0 :
            return False
    return True


I know you Cambridge Maths nerds say that '1' is not prime, but I disagree, so I am putting it into the list of primes. You can hate on that later.

if __name__ == "__main__" :
    max = int(sys.argv[1])
    n_primes = max // 2 + 2
    primes = []
    for i in range(1,max+1) :
        if is_prime(i) :
            primes.append(i)
            if len(primes) == n_primes :
                break

    print("found",len(primes),"primes. Lagrange predicted"int(max / np.log(max)),", I predicted",n_primes)
    print(primes)

Now run that in python and you get the list of prime numbers between 1 and the first command line argument, which is N for argument's sake.

"python mkprime.py 26"

found 10 primes. Lagrange predicted 7 , I predicted 15
me: [1, 2, 3, 5, 7, 11, 13, 17, 19, 23]
Lagrange: [1, 2, 3, 5, 7, 11, 13 ]

Obviously, Lagrange under estimated the number of primes and we missed some crucial primes up to 26. Primes are very important numbers, so we don't want to miss them. They are also very expensive to compute, so we don't want to compute more than we know exist. 

What if we determine a number to be prime and have a last known prime that is adjacent to that candidate prime? How do we know that the candidate is really prime? Is there a rule to follow? I really don't know, but what I did find is that there is a pattern to the placement of primes on the number line. They are not random.

    diffs = []
    diffs.append(0)
    for i in range(1,len(primes)) :
        diffs.append(primes[i] - primes[i-1])
    print(diffs)

Add that to your python, and run it again:

found 10 primes. Lagrange predicted 7 , I predicted 15
[1, 2, 3, 5, 7, 11, 13, 17, 19, 23]
[0, 1, 1, 2, 2, 4, 2, 4, 2, 4]

Hmmm. After '3' the diff is 2 or 4 from the last adjacent prime. Let's shoot it farther out ...

"python mkprime.py 2000"

the diff:
[0, 1, 1, 2, 2, 4, 2, 4, 2, 4, 6, 2, 6, 4, 2, 4, 6, 6, 2, 6, 4, 2, 6, 4, 6, 8, 4, 2, 4, 2, 4, 14, 4, 6, 2, 10, 2, 6, 6, 4, 6, 6, 2, 10, 2, 4, 2, 12, 12, 4, 2, 4, 6, 2, 10, 6, 6, 6, 2, 6, 4, 2, 10, 14, 4, 2, 4, 14, 6, 10, 2, 4, 6, 8, 6, 6, 4, 6, 8, 4, 8, 10, 2, 10, 2, 6, 4, 6, 8, 4, 2, 4, 12, 8, 4, 8, 4, 6, 12, 2, 18, 6, 10, 6, 6, 2, 6, 10, 6, 6, 2, 6, 6, 4, 2, 12, 10, 2, 4, 6, 6, 2, 12, 4, 6, 8, 10, 8, 10, 8, 6, 6, 4, 8, 6, 4, 8, 4, 14, 10, 12, 2, 10, 2, 4, 2, 10, 14, 4, 2, 4, 14, 4, 2, 4, 20, 4, 8, 10, 8, 4, 6, 6, 14, 4, 6, 6, 8, 6, 12, 4, 6, 2, 10, 2, 6, 10, 2, 10, 2, 6, 18, 4, 2, 4, 6, 6, 8, 6, 6, 22, 2, 10, 8, 10, 6, 6, 8, 12, 4, 6, 6, 2, 6, 12, 10, 18, 2, 4, 6, 2, 6, 4, 2, 4, 12, 2, 6, 34, 6, 6, 8, 18, 10, 14, 4, 2, 4, 6, 8, 4, 2, 6, 12, 10, 2, 4, 2, 4, 6, 12, 12, 8, 12, 6, 4, 6, 8, 4, 8, 4, 14, 4, 6, 2, 4, 6, 2, 6, 10, 20, 6, 4, 2, 24, 4, 2, 10, 12, 2, 10, 8, 6, 6, 6, 18, 6, 4, 2, 12, 10, 12, 8, 16, 14, 6, 4, 2, 4, 2, 10, 12, 6, 6, 18, 2, 16, 2, 22, 6, 8, 6, 4, 2]

Whoa. The difference between two adjacent primes is always an even distance. Let's histogram that and also let's make that first element '1' instead of '0' because 1 is '1' away from 0, just to make you Cambridge Maths nerds even more frustrated. Here's the histogram for primes up to 26.

1 : 3
2 : 4
4 : 3

How about for the primes up to 10,000, here is the histogram of the differences between adjacent primes:

1 : 3
2 : 205
4 : 202
6 : 299
8 : 101
10 : 119
12 : 105
14 : 54
16 : 33
18 : 40
20 : 15
22 : 16
24 : 15
26 : 3
28 : 5
30 : 11
32 : 1
34 : 2
36 : 1

Lemma: The difference between any two primes that are larger than 3 is always a factor of 2.

In fact, the majority of primes are 6 away from the next prime. This creates a nifty algorithm for finding primes. Starting with the first prime you find, just check every other number. What's even more interesting is the distribution of primes in [1,1e6], shown in the following figure:

This figure looks almost like a Poisson probability mass function with lambda=4:

What is really looks like is the Nearest Neighbor Routing probability distribution function in Jung, Haejoon & Lee, In-Ho. (2018). Performance Analysis of Millimeter-Wave Multi-hop Machine-to-Machine Networks Based on Hop Distance Statistics. Sensors (Basel, Switzerland). 18. 10.3390/s18010204. See equation 16 in that paper and figure 3. The red line in that paper's Figure 3 matches almost exactly the shape of the distribution of prime numbers.

Every n-digit prime number, where n > 1, ends in either 1, 3, 7, or 9. The likelihood of either of these numbers is almost even. The numbers 1 and 9 are equally likely and 3,7 are equally likely, with a slightly higher chance of being 3 or 7 in the [10,1e8] number range.

1 : 1440298 - 24.99 %
3 : 1440473 - 25.0 %
7 : 1440494 - 25.0 %
9 : 1440186 - 24.99 %

When I ran 1e8 primes, the distance graph turned out a quirk. The distance of 132 appeared 132 times. This following table shows the distribution of distances between adjacent primes in [1,1e8].



The Second Hardy-Littlewood Conjecture

https://en.wikipedia.org/wiki/Second_Hardy%E2%80%93Littlewood_conjecture

I learned about this conjecture today which states that the sum of the number of primes in A and B is greater than the number of primes in A+B. Using my formula to find the number of primes, let's prove this conjecture. The following is a rudimentary demonstration of this conjecture, not necessarily a proof because there is no proof that my theorem is accurate. It's more accurate than Lagrange's method, though. Someone else agrees that this conjecture is true: https://arxiv.org/abs/2101.03283











Popular posts from this blog

THE RISE OF FASCIST SOCIAL MEDIA

The Merriam-Webster dictionary defines fascism as: a tendency toward or actual exercise of strong autocratic or dictatorial control .  The phrase "dictatorial control" is important for the case that I am going to make about fascism in social media. The word "dictatorial" means "of or relating to a dictator," and a dictator is "one ruling in an absolute and often oppressive way." In 2020, social media has seen a rise in the number of autocratic events of censorship. The two social media outlets that I am going to focus on are Facebook and Twitter.  Background Facebook is a semi-private curated blogging platform where you, the user, share information at your leisure. The public part of Facebook is in Facebook Groups. With a group, outside people who are not privy to your "Facebook Wall" will join your group and establish a communal discourse. This can be private, by invitation only, or public. The Facebook is auth-walled so that you must ...

Clustered Foolishness

I had morning coffee with a well respected friend of mine recently. Aside from chatting about the usual wifery and family, we touched on the subject of clustered indices and SQL Server performance. A common misconception in the software industry is that a clustered index will make your database queries faster. In fact, most cases will demonstrate the polar opposite of this assumption. The reason for this misconception is a misunderstanding of how the clustered index works in any database server. A clustered index is a node clustering of records that share a common index value. When you decide on an index strategy for your data, you must consider the range of data to be indexed. Remember back to your data structures classes and what you were taught about hashtable optimizations. A hashtable, which is another way of saying a database index, is just a table of N values that organizes a set of M records in quickly accessible lists that are of order L, where L is significantly less than M. ...

Trademarks In The Dark

If you have a business, then you know that filing for a trademark is pretty easy in the USA. You just go to the USPTO web site ( www.uspto.gov ) and start filling out the form. The cost is significantly less now, nearly a third of what it was a couple of years ago. That's great news. What you don't know about your mark, though, is that there is a plethora of common law that dictates whether or not you can file with your specimens. The specimens are documents that clearly show your mark being used in commerce. Well, my last mark registration came back to me with the examiner asking for a better specimen that places the mark in closer proximity to evidence of commerce. Closer proximity. Yeah. Right. Apparently Lands’ End, Inc. v. Manbeck, 797 F. Supp. 511, 514, 24 USPQ2d 1314, 1316 (E.D. Va. 1992); In re Dell Inc., 71 USPQ2d 1725, 1727-1729 (TTAB 2004); In re MediaShare Corp., 43 USPQ2d 1304 (TTAB 1997); TMEP §§904.06(a) and (b), establish some common law that determines an acce...