To cheer you up in difficult times 5: A New Elementary Proof of the Prime Number Theorem by Florian K. Richter

Here is a piece of news that will certainly cheer you up: Florian Richter found A new elementary proof of the prime number theorem. (I thank Tami Ziegler for telling me about the new result.)

From left to right: Atle Selberg, Florian Richter, and Paul Erdos

From the Introduction

“One of the most fundamental results in mathematics is the Prime Number Theorem, which describes the asymptotic law of the distribution of prime numbers in the integers.

Prime Number Theorem. Let π(N) denote the number of primes smaller or equal to a positive integer N. Then

\lim_{N \to \infty}   \frac {\pi (N)}{N/\log N}~=~1

The Prime Number Theorem was conjectured independently by Gauß and Legendre towards the end of the 18th century and was proved independently by Hadamard and de la Vallée Poussin in the year 1896. Their proofs are similar and rely on sophisticated analytic machinery involving complex analysis, which was developed throughout the 19th century by the combined effort of many great mathematicians of this era, including Euler, Chebyshev, and Riemann (to name a few).

Even though it was believed for a long time not to be possible, more than 50 years after the analytic proof of the Prime Number Theorem was discovered, an elementary proof was found by Erdős and Selberg [Erd49; Sel49]. In this context, elementary refers to methods that avoid using complex analysis and instead rely only on rudimentary facts from calculus and basic arithmetic identities and inequalities. Their approach was based on Selberg’s famous “fundamental formula”:

\sum_{p \le x}\log ^2(p)~+~\sum_{pq \le x}\log (p)\log (q)~=~2x\log (x)+O(x).

In this paper we provide a new elementary proof of the Prime Number Theorem, using ideas inspired by recent developments surrounding Sarnak’s and Chowla’s conjecture. With the exception of Stirling’s approximation formula, which is used in Section 3 without giving a proof, our proof is self-contained.”

The introduction also cites papers on the history of the analytic method and an abridged version of it by Newman, on several earlier elementary proofs of the PNT, and on recent dynamically inspired way by McNamara of deriving the Prime Number Theorem from Selberg’s fundamental formula.

Some thoughts and remarks

What precisely is elementary, formally?

What is the formal distinction between elementary and non-elementary proofs of the PNT? (This was explained to me several times but I don’t remember the details.) The distinctions between “non-explicit” and “explicit” proofs (or constructions) and between “effective” and “non-effective” proofs are quite clear cut. But here I am less certain.

Are there other important distinctions about proofs that you are aware of?

PNT and Boolean functions (weighted majority functions)

An equivalent formulation of the prime number theorem is the following:

(*)~~~ \sum_{k \le n}\mu (k)~=~o(n),

where \mu is the Mobius function. Consider now a Boolean function f(x_1,x_2, \dots, x_n) with n variables. Here the variables x_i and the function f has values in \{0,1\}. Given n nonnegative weights w_1\le w_2 \le \cdots \le w_n and C>0 the function $f$ is a weighted majority function if f=1 iff w_1x_1+\cdots +w_nx_n \le C. Let p(f) be the probability that f=1, and let \hat f[n] be the top Fourier coefficient of f.


(1) Find conditions for a weighted majority function f that guarantee that

(**)~~~ \hat f([n])~=~o( p(f))

Let p_k be the kth prime, w_k = \log p_k, and C=\log n. The PNT asserts that the Boolean function associated to these parameters satisfies (**). (It is very easy to see that (**) and (*) are essentially the same.)  From vague understanding of some fragments of older elementary proofs of the PNT it looks that the proofs could perhaps be described in this way, namely starting with some weaker properties on the weights (namely the primes)  perhaps the derivation of (**) apply to general weighted majority functions with these properties. (This is related to PNT for systems of Beurling primes.) Finding conditions for (**) is interesting also for other weighted majority functions where the weights are not related to the growth behavior of the primes.  In fact, the following is also interesting:

(2) Find conditions for (general) Boolean functions f that guarantee that (**) holds, namely that the top Fourier coefficients is little o of p(f). Signs of low degree polynomials form a very interesting class of  Boolean functions, and they are of special interest also here.

Remark: The first problem looks very related to PNT for systems of Beurling primes going back to 1937 paper by Arne Beurling. (Based on the inequality 1937 < 1949, I understand that Breuling’s proof was not elementary, and I don’t know if the elementary proofs for PNT gives Breuling’s theorem as well.)

PNT here on the blog and Mobius randomness

The new paper is related to conjectures by Chowla and Sarnak and to the notion of Mobius randomness that we considered in an earlier post. The basic question is for which functions f of k, {\bf (***)} \sum_{k \le n} \mu (k)f(k) = o(\sum_{k \le n}f(k)). This is correct if f is random (or random-like in a variety of meanings), and it turns out to be correct if (in some sense) if f is of low complexity. We also mentioned some analogs of the PNT in this post. Both polymath 4 and polymath5 had some connections to the PNT.

Is PNT to expanders like RH to Ramanujan graphs?

This is another connection to combinatorics, See this post in “In Theory”.

This entry was posted in Number theory, Updates and tagged , , . Bookmark the permalink.

11 Responses to To cheer you up in difficult times 5: A New Elementary Proof of the Prime Number Theorem by Florian K. Richter

  1. John Baez says:

    I would guess that an elementary proof of the prime number theorem can be formalized in Peano Arithmetic, while a non-elementary one might use a more powerful set of axioms. For example, RCA0 is a weak version of second-order arithmetic that allows you to do some analysis which goes beyond Peano arithmetic, like prove the intermediate value theorem. The link also lists some stronger systems, which together with RCA0 form “the big five”: most everyday math can be done in one of these systems.

    • Gil Kalai says:

      Dear John, many thanks for the comment and link. Certainly the axiomatic setting that allows the proof is a major distinction between various proofs. However I am not sure how crucial this distinction is to “real-life” proofs and, in particular, is it relevant to the elementary/non-elementary distinction.

  2. wdjoyner says:

    Arne Breuling is spelled Arne Beurling.

    GK: Thanks corrected.

  3. Thanks for a *very* interesting blog!

    Your question, “What precisely is an elementary proof?” reminds me of an analogous question in algorithm design: “What precisely is a combinatorial algorithm?”
    I don’t believe there is a “formal” definition of “combinatorial” algorithm, but one can tell it apart from a “non-combinatorial” one, just as easily as one can tell oil from water …

  4. segevgonen says:

    In the context of number theory, I always thought ‘elementary’ just meant ‘without using complex analysis’ but I suppose this distinction can be difficult at times?

  5. Ben Green says:

    Hi Gil. At some point in the past I also noted that (*) can be written as (**) and then wondered whether there is some fairly general result about weighted Hamming balls from which PNT follows as a corollary.

    Having thought about this a bit more, I am not so optimistic, except when the weights w_i are very small perturbations of log p_i, in which case something might follow from Beurling’s PNT as you say (but I’m not quite sure; there’s also the issue of whether cancellation of “Mobius” is equivalent to PNT for Beurling primes – the book “Beurling generalised numbers” by Diamond and Zhang should clear this up).

    The main reason I am not optimistic is it seems that saying anything about p(f) in general, even an upper bound which one expects to be within a constant factor of the truth, seems impossible. For instance, if all the weights w_i are 1, then p(f) has big jumps at integer values of C (and also if C << n/2 then case f^[n] will *not* be o(p(f)) since the character (-1)^{x_1 + …+ x_n} is constant on each Hamming sphere, and the contribution from the outer one dominates) . Regarding p(f), it seems lots of other strange things can happen, even when the w_i are some subset of the log p_is. See page 2 of Granville, Koukoulopoulos and Matomaki's paper "when the sieve works" for some examples.

    I suppose it's vaguely plausible that one could show f^[n] = o(p(f)) without understanding what p(f) is, but this does seem rather unlikely.

    • Gil Kalai says:

      Hi Ben, many thanks for your comment. I guess some cases I will be curious to know the situation are:

      1) w_k = log k. For which C we have \hat f([n])=o(p(f)? (This corresponds to declaring all integers as primes which is anyway a just declaration.)

      2) w_k=\log k + \log \log k and C=\log n. So this corresponds to the growth behavior of primes without the arithmetic properties. (And we can play with the value of C.)

      3) Take every prime below n with probability 1/2 , and take their logs. (Maybe we keep C=\log n, but see 4) )

      4) take w_k = \log p_k but vary C. In other words we want to consider all integers below n and all primes below m and count # integers that are product of even number of these primes MINUS # integers that are product of odd number of these primes.

      I suppose that at least 3 and 4 are known (but not to me), and even when PNT holds (in the mobius formulation) it is interesting when we have stronger facts a la RH.

      • Ben Green says:


        Yes, 3 and 4 could certainly be answered I think. My interest in this whole line of enquiry, though, was whether one could prove something about general weighted Hamming balls which implied PNT and moreover in such a way that it shed light on the proof of PNT. Having thought about things somewhat more I rather doubt that.


      • Gil Kalai says:

        Ben, I agree that expecting PNT for general form of weighted majority functions is a long shot. The reason I asked about 3 and 4 is that while these questions are close to the original NT they can give a hint about possible extensions (or why extensions are impossible). Also 1 and 2 seems natural in order to examine such potential extensions. I vaguely remember that for question 3 I was once told that the PNT holds but not the RH.

        Another condition I thought about is this: Consider weights w_k where w_k=log (k)(1+0(1)). Suppose that

        (*) the number of sums of the w_i‘s in every interval [x, x+b/x] is at most “1” when b is smaller than some positive b_1, and the number of sums of the w_i‘s in every interval [x, x+b/x] is at least “1” when b is larger b_2

        Then for C ~log n, PNT holds (or even stronger RH-style statements).

        (Most likely it is false but maybe this is “cheating” as (*) may force us to return to the arithmetic scenario.)

      • Gil Kalai says:

        Ben, one more remark is this. The conditions to guarantee the conclusion that \hat f([n])=o(p(n) can involve statements not only about the weights but also about other parameters of the weighted majority functions (e.g. influences, and the low degree Fourier coefficients). Therefore I am not sure that Beurling PNT is most relevant to this inquiry but perhaps rather trying to extend various elementary proofs for the PNT to this setting.

        (PS. I think we can propose a nice joint MO question on this problem.)

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s