To cheer you up in difficult times 9: Alexey Pokrovskiy proved that Rota’s Basis Conjecture holds asymptotically

Pokrovskiy’s startling morning  rainbow

Rota’s Basis Conjecture holds asymptotically, by Alexey Pokrovskiy

Abstract: Rota’s Basis Conjecture is a well known problem from matroid theory, that states that for any collection of n bases in a rank n matroid, it is possible to decompose all the elements into n disjoint rainbow bases. Here an asymptotic version of this is proved. We show that it is possible to find n − o(n) disjoint rainbow independent sets of size n − o(n).

A rainbow basis is a basis with one element from each collection.

(I thank Nati Linial for telling me about it.)

Another way to formulate Rota’s basis conjecture (for representable matroids) is that if B1B2, …, Bn are n bases of an n-dimensional vector space V (not necessarily distinct or disjoint), then there exists an n × n grid of vectors (vij) such that

1. the n vectors in row i are the members of the ith basis Bi (in some order), and

2. in each column of the matrix, the n vectors in that column form a basis of V.

If all the bases are the standard basis then this reduces to the existence of Latin squares.

Unrelated trivia question:  AGC-GTC-TGC-GTC-TGC-GAC-GATC-? what comes next in the sequence?

We mentioned Rota’s basis conjecture in various earlier posts.  A classic paper on the subject is the 1989 paper by Rosa Huang and Gian Carlo-Rota. Three and a half years ago Timothy Chow lunched a polymath project (Polymath 12) to solve it. (Here is my post on the project with various variants of the conjecture, the first post on the polymath blog, and the wiki). See this post for several famous conjectures by Rota, and this post about the related Alon-Tarsi conjecture.

Posted in Combinatorics, Updates | Tagged , , | 4 Comments

To Cheer you up in Difficult Times 8: Nathan Keller and Ohad Klein Proved Tomaszewski’s Conjecture on Randomly Signed Sums

Today we talk about the paper, Proof of Tomaszewski’s Conjecture on Randomly Signed Sums, by Nathan Keller and Ohad Klein.

Consider a unit vector a=(a_1,a_2,\dots, a_n). That is latex \sum_{i=1}^n a_i^2=1$. Consider all (2^n) signed sums \displaystyle \epsilon_1a_1+\epsilon_2a_2+\cdots +\epsilon_n a_n, where each \epsilon_k is either 1 or -1.

Theorem (Keller and Klein (2020) asked by Boguslav Tomaszewski (1986)): For at least 2^{n-1} signed sums \displaystyle |\epsilon_1a_1+\epsilon_2a_2+\cdots +\epsilon_n a_n| \le 1 .

Another way to state the theorem is that the probability of a signed sum to be in the interval [-1, 1] is at least 1/2.

To see that this is best possible consider the case that n=2 and let a_1,a_2 be non zero. For the sum in question to exceed one we need both summands to have the same sign which happens half of the times. There is another example of importance, the vector (\frac{1}{2}, \frac{1}{2}, \frac {1}{2}, \frac{1}{2}). Here 3/8 of the absolute values of signed sums (6 out of 16) are below 1 (in fact, equal to zero), 1/2 equal to 1 and 1/8 exceed 1. Holzman and Kleitman proved in 1992 that the fraction of absolute values of signed sums below 1 is always at least 3/8.

Congratulations to Nathan and Ohad. I will say a little more about the problem below but before that, a few more things.

A few more things

Luca Trevisan posted on his blog In Theory a post “Silver linings” about two cheerful pieces of news. The first one is “Karlin, Klein, and Oveis Gharan have just posted a paper in which, at long last, they improve over the 1.5 approximation ratio for metric TSP which was achieved, in 1974, by Christofides.”

The second  one is about breaking the logarithmic barrier for Roth’s theorem that we wrote about here. This was also discussed by Bill Gasarch on Computational Complexity. In the comment section of my post there is an interesting discussion regarding timetable for future achievements and how surprising they would be.

The third is about Ron Graham, a friend and a mathematical giant who passed away a few days ago. Here is a moving post by Fan Chung, a web page for Ron set by Fan, and a blog post by Dick and Ken on GLL.

The fourth is that there is a nice collection of open problems on Boolean functions that is cited in the paper of Nathan and Ohad:  Y. Filmus, H. Hatami, S. Heilman, E. Mossel, R. O’Donnell, S. Sachdeva, A. Wan, and K. Wimmer, Real analysis in computer science: A collection of open problems.

The fifth is that both our (HUJI) combinatorics seminar and basic notions seminar are running and are recorded. Here are the links. (Hmm, the links are not yet available, I will update.)

Back to the result of Keller and Klein

Daniel Kleitman and Ron Holzman

A quick orientation

If the a_is are all the same, or small, or random, then to compute the probability that the weighted sum is between -1 and 1, we can use some Gaussian approximation and then we will find ourselves in a clash of constants that goes our way. The probability will be close to a constant well above 1/2. So what we need to understand is the case where some a_is are large.

Early papers on the problem

The problem first appeared in the American Math Monthly.  Richard Guy collected several problems  and challenged the readers Any Answers Anent These Analytical Enigmas? (I don’t know what the fate of the other questions is.)  Holzman and Kleitman proved in 1992 that the fraction of absolute values of signed sums below 1 is always at least 3/8, and this is tight. For many years, 3/8 was the record for the original problem, until  the 2017 paper by Ravi Boppana and Ron Holzman: Tomaszewski’s problem on randomly signed sums: Breaking the 3/8 barrier, where a lower bound of 0.406, was proved. The current record 0f 0.46 was proved in the paper Improved Bound for Tomaszewski’s Problem by Vojtěch Dvořák, Peter van Hintum, and Marius Tiba. The new definite result by Nathan and Ohad used some ideas of these early papers.

What is the crux of matters

Let me quote what the authors kindly wrote me:

“The crux of the matter is how to deal with the case of very large coefficients (a_1+a_2>1). We gave a short semi-inductive argument covering this case (this is Section 5 of the paper). The argument is only semi-inductive, as it requires the full assertion of Tomaszewski for any n'<n, and gives only the case (a_1+a_2>1) for n. But this means that if we can handle all other cases by other methods then we will be done.

The semi-inductive argument takes only 3 pages. Handling the other cases takes 72 more pages and requires several new tools, but is closer to things that were done in previous works. (Actually, after we found the 3-page argument, we were quite sure we will be able to finalize the proof; this indeed happened, but took a year).”

Most of the paper deals with the case of small coefficients. This requires several ideas and new tools.

Rademacher sums: Improved Berry-Esseen and local tail inequalities

If all coefficients are “sufficiently small”, then we can
approximate X by a Gaussian and the inequality should follow. However, using the standard Berry-Esseen bound, this holds only if all coefficients are less than 0.16.
Nathan and Ohad showed that for Rademacher sums, namely random variables of the form X=\sum a_i x_i, as discussed in the conjecture, a stronger Berry-Esseen
bound can be obtained, and this bound shows immediately that Tomaszewski’s assertion holds whenever all coefficients are less than 0.31. The stronger bound stems
from a method of Prawitz, presented in the 1972 paper. H. Prawitz, Limits for a distribution, if the characteristic function is given in a finite domain, which appeared in the Scandinavian Actuarial journal.

The second tool is local tail inequalities for Rademacher sums, of the form \Pr[a<X<b] \leq \Pr[c<X<d], where a,b,c,d satisfy certain conditions. Inequalities of this kind were obtained before by Devroye and Lugosi in the 2008 paper:  Local tail bounds for functions of independent random variables.

These local tail inequalities already have some other applications, e.g., to analysis of Boolean functions. They were developed and applied in an earlier paper paper of Keller and Klein: Biased halfspaces, noise sensitivity, and relative Chernoff inequalities. Let me mention my related MO question A variance tail description for continuous probability distributions.

A couple more ingredients

A  stopping time argument. Variants of Tomaszewski’s problem appeared in various fields. The problem was stated independently in a 2002 paper by Ben-Tal, Nemirovski, Roos, Robust solutions of uncertain quadratic and conic-quadratic problems.  A stopping time argument introduced there (for proving a lower bound of 1/3) played a crucial role in subsequent works and the critical semi-inductive argument by Nathan and Ohad.

Refinements of the famous Chebyshev’s inequality.  (Did you know Chebyshev’s full name? Ans: Pafnuty Lvovich Chebyshev.)

Questions and connections that come to mind

Q1: What can be said about families \cal F of signs that can serve as those signs for which  |\epsilon_1a_1+\epsilon_2a_2+\cdots +\epsilon_n a_n| \le 1 , for some vector a.

Q2: What can be said about the complex version or even more generally about high dimensions?

Q3: Are there any relations to Littlewood-Offord type problems?

Q4: Is there any relation to the Komlos Conjecture?

See also this MO question by Luca Trevisan and this one by George Lowther.

Is there a simpler proof?

We can ask about simpler or just different proofs for almost every result we discuss here. But here the statement is so simple…

Posted in Analysis, Combinatorics, Probability | Tagged , , | 11 Comments

Noam Lifshitz: A new hypercontractivity inequality — The proof!

This is a guest post kindly contributed by Noam Lifshitz. Here is a pdf version.  This post is a continuation of the post  To cheer you up in difficult times 3: A guest post by Noam Lifshitz on the new hypercontractivity inequality of Peter Keevash, Noam Lifshitz, Eoin Long and Dor Minzer, and it gives the proof of the new hypercontractive inequality. We plan a third post where various applications will be mentioned.

Before we get to the post I want to mention that there are a lot of activities on the web. I may devote a special post to links and discussion (and contributing links in the comment section is very welcome.) but meanwhile a few links:  1) Advances in Boolean Function Analysis Lecture Series (thanks to Avishay Tal and Prasad Raghavendra for letting me know); 2) Online course in Foundations of Algebraic Geometry Given by Ravi Vakil from Stanford. You can take the course at varying levels of involvement. (Thanks to Tami Ziegler for telling me) A very very interesting way of online teaching. 3) A site with online mathematical lectures.


Aline Bonami with Szilard Revesz and me (2006). Aline Bonami first proved the 2-point hypercontractive inequality which is very useful in the analysis of Boolean functions. (Leonard Gross proved it independently a few years later and William Beckner found important applications to harmonic analysis.)

Proof of the new hypercontractivity inequality

Our aim is to prove the hypercontractivity theorem for global functions. The proof here is taken from a joint paper with David Ellis and Guy Kindler that’ll soon be out on the Arxiv.

Theorem 1:

\displaystyle \|\mathrm{T}_{1/100}f\|_{4}^{4}\le\sum_{S\subseteq\left[n\right]}\mathbb{E}_{x\in\left\{ 1,\ldots,m\right\} ^{S}}\|L_{S}[f]_{S\rightarrow x}\|_{2}^{4},

Here we use the notations given in the last blog post. Let us first get a feel for our hypercontractivity theorem by proving the {n=1} case. Here the RHS is {\|f\|_{2}^{4}+\|f-\mathbb{E}\left[f\right]\|_{4}^{4}.}

1. Proof of the {n=1} case

We will prove the following slightly stronger version of Theorem 1 for the {n=1} case.

Proposition 2:

Let {f\colon\left\{ 1,\ldots,m\right\} \rightarrow\mathbb{C}.} Let {\rho\le\frac{1}{10}.} Then
\displaystyle \|\mathrm{T}_{\rho}f\|_{4}^{4}\le\|f\|_{2}^{4}+\|f-\mathbb{E}\left[f\right]\|_{4}^{4}.

Proof: Let us write {L\left[f\right]} for {L_{1}\left[f\right]=f-\mathbb{E}\left[f\right].} Rearranging, we have
\displaystyle f=\mathbb{E}\left[f\right]+L\left[f\right].
The noise operator in the {n=1} case is by definition equal to {\rho Id+\left(1-\rho\right)\mathbb{E},} where {\mathbb{E}} is the expectation over {\text{\ensuremath{\left\{ 1,\ldots,m\right\} }}} operator, and {Id} is the identity operator. Hence,
\displaystyle \mathrm{T}_{\rho}f=\mathbb{E}\left[f\right]+\rho L[f].

Now when expanding the 4-norm of the function {\|\mathrm{T}_{1/100}f\|_{4}^{4}}, we obtain

\displaystyle \|\mathrm{T}_{\rho}f\|_{4}^{4}  \le\left|\mathbb{E}\left[f\right]\right|^{4}
\displaystyle +6\rho^{2}\left|\mathbb{E}\left[f\right]\right|^{2}\|Lf\|_{2}^{2}+
\displaystyle  +4\rho^{3}\left|\mathbb{E}\left[f\right]\right|\|L\left[f\right]\|_{3}^{3}+
\displaystyle + \rho^{4}\|L\left[f\right]\|_{4}^{4},

where we used the fact that the expectation of {L\left[f\right]} is 0. When looking at the right hand side of the global hypercontractivity theorem, we see most of the above terms except for the one involving the third norm of the Laplacian. Indeed we have

\displaystyle RHS =\|f\|_{2}^{4}+\|L\left[f\right]\|_{4}^{4}
\displaystyle =\left|\mathbb{E}\left[f\right]\right|^{4}
\displaystyle +2\|L\left[f\right]\|_{2}^{2}\left|\mathbb{E}\left[f\right]\right|^{2}
\displaystyle +\|L\left[f\right]\|_{2}^{4}
\displaystyle +\|L\text{\ensuremath{\left[f\right]\|}}_{4}^{4}.

Hence we see that the only term in the left hand side that doesn’t appear with a greater coefficient in the left hand side is the term {\left|\mathbb{E}\left[f\right]\right|\|L\left[f\right]\|_{3}^{3},} and by AM-GM we have

\displaystyle \left|\mathbb{E}\left[f\right]\right|\|L\left[f\right]\|_{3}^{3} \displaystyle =\mathbb{E}\left[\left|\mathbb{E}\left[f\right]\right|\left|L\left[f\right]\right|^{3}\right]
\displaystyle \le\mathbb{E}\left[\frac{\left|\mathbb{E}\left[f\right]\right|^{2}\left|L\left[f\right]\right|^{2}+\left|L\text{\ensuremath{\left[f\right]}}\right|^{4}}{2}\right]
\displaystyle =\frac{1}{2}\left|\mathbb{E}\left[f\right]\right|^{2}\|L\left[f\right]\|_{2}^{2}+\frac{1}{2}\|L\left[f\right]\|_{4}^{4},

which allows us to upper bound the only term appearing in the left hand side but not in the right hand side by corresponding terms that do appear in the right hand side. \Box

2. Tensorisation lemma

Next we are going to prove a theorem that doesn’t seem to fit to our setting, but we’re going to fit it in by force. Let {X,Y} be finite sets. Let us write {\mathcal{F}\left(X\right)} for the linear space of complex valued functions on {X}. The space {\mathcal{F}\left(X\times Y\right)} can be identified with the space {\mathcal{F}\left(X\right)\otimes\mathcal{F}\left(Y\right),} where a pair of function {f\otimes g} is identified with the function
\displaystyle \left(x,y\right)\mapsto f\left(x\right)g\left(y\right)
in {\mathcal{F}\left(X\times Y\right).}

Given two operators {A_{1}\colon\mathcal{F}\left(X_{1}\right)\rightarrow\mathcal{F}\left(Y_{1}\right),A_{2}\colon\mathcal{F}\left(X_{2}\right)\rightarrow\mathcal{F}\left(Y_{2}\right)}, the operator {A_{1}\otimes A_{2}\colon\mathcal{F}\left(X_{1}\times X_{2}\right)\rightarrow\mathcal{F}\left(Y_{1}\times Y_{2}\right)} is the unique operator sending {f\otimes g} to {A_{1}f\otimes A_{2}g}. We write {A^{\otimes n}} for {A\otimes\cdots\otimes A.} The operator {A_{1}\otimes A_{2}} can also be defined more explictly in terms of its values on functions. The operator {A_{1}\otimes A_{2}} can be understood more explicitly by noting that it is the composition of the operators {A_{1}\otimes I} and {I\otimes A_{2}.} Now the operator {A\otimes I} is given by {A\otimes If\left(x,y\right)=Af_{y}\left(x\right),} where {f_{y}\left(x\right)=f\left(x,y\right).}

Lemma 3: Let {X,Y,Z} be measure spaces with finite underlying sets. Let {A\colon\mathcal{F}\left(X\right)\rightarrow\mathcal{F}\left(Y\right),B\colon\mathcal{F}\left(X\right)\rightarrow\mathcal{F}\left(Z\right)} be operators satisfying

\displaystyle \|Af\|_{4}\le\|Bf\|_{4}

for all functions {f\in\mathcal{F}\left(X\right).}

\displaystyle \|A^{\otimes n}f\|_{4}\le\|B^{\otimes n}f\|_{4}

for all {f\in\mathcal{F}\left(X^{n}\right).}

Here the spaces {X^{n},Y^{n},} and {Z^{n}} are equipped with the product measure, where the measure of an atom is the product of the measures of its coordiates.

Proof: For each {y\in X,} let {g_{y}} be given by {g_{y}:=A^{\otimes\left(n-1\right)}f\left(\cdot,y\right).} As mentioned {A^{\otimes n}f\left(x,y\right)=Ag_{y}\left(x\right).} Hence by hypothesis, we have

\displaystyle \mathbb{E}\left[\left|\mathrm{A}^{\otimes n}f\right|^{4}\right] \displaystyle =\mathbb{E}_{y}\mathbb{E}_{x}\left|Ag_{y}\left(x\right)\right|^{4}
\displaystyle  \le\mathbb{E}_{y}\mathbb{E}_{x}\left|Bg_{y}\left(x\right)\right|^{4}
\displaystyle =\|A^{\otimes n-1}\otimes B\|_{4}^{4}. We may now repeat the same process on each of the other coordinates to replace the {A}s by {B}s one by one. \Box

3. The main idea: Fourifying the 2-norms.

The strategy of our proof is to take the theorem

\displaystyle \|\mathrm{T}_{\rho}f\|_{4}\le\|f\|_{2}^{4}+\|f-\mathbb{E}\left[f\right]\|_{4}^{4},

which we established in the {n=1} case for {\rho\le\frac{1}{10}}, and to turn it into an essentially equivalent statement about 4-norms. We will then get a tensorised statement for general {n}, which we will be able to convert back into our hypercontractivity theorem for global functions. Our idea is to encode our function {f} as a function {\mathrm{En}\left(f\right)\colon\left\{ -1,1\right\} ^{n\left(p-1\right)}\rightarrow\mathbb{R}} satisfying

\displaystyle \mathrm{En}\circ T_{\rho}=\mathrm{T}_{\rho}\circ\mathrm{En}


\displaystyle \|\mathrm{En}f\|_{2}=\|f\|_{2}.

The benefit of working with {\mathrm{En}f} rather than {f} is that in {\left\{ 0,1\right\} ^{n\left(p-1\right)}} one may move between 4-norms and 2-norms by appealing to the hypercontractivity theorem there, which gives

\displaystyle \|\mathrm{T}_{\frac{1}{\sqrt{3}}}\circ\mathrm{En}f\|_{4}\le\|\mathrm{E}nf\|_{2}\le\|\mathrm{En}f\|_{4}

at the cost of some noise.

To define {\mathrm{En}} we use Fourier analysis of Abelian groups. Let us briefly recall it. For simplicity let us assume that {f\colon\left(\mathbb{Z}/p\mathbb{Z}\right)^{n}\rightarrow\mathbb{C},} where {p} is a prime. Let {\omega} be a {p}th root of unity. For any {\gamma\in\left(\mathbb{Z}/p\mathbb{Z}\right)^{n}} we have a character {\chi_{\gamma}\colon\left(\mathbb{Z}/p\mathbb{Z}\right)^{n}\rightarrow\mathbb{C}} given by {\chi_{\gamma}\left(x\right)=\omega^{\left\langle \gamma,x\right\rangle }.} The {\chi_{\gamma}} are an orthonormal basis of {\left(\mathbb{Z}/p\right)^{n}} and we write {f=\sum\hat{f}\left(\gamma\right)\chi_{\gamma}}, where {\hat{f}\left(\gamma\right)=\left\langle f,\chi_{\gamma}\right\rangle .} Note that {\chi_{0}} is the constant function, and so we have

\displaystyle \hat{f}\left(0\right)=\left\langle f,\chi_{0}\right\rangle =\mathbb{E}\left[f\right],

which gives

\displaystyle f=\mathbb{E}\left[f\right]+\sum\hat{f}\left(i\right)\chi_{i}.

Our mission will first be to convert the {2}-norm of a function {f\colon\mathbb{Z}/p\rightarrow\mathbb{R}} to the {4-}norm of a different function.

We define an encoding operator {\mathrm{En}\colon\mathbb{Z}/p\mathbb{Z}\rightarrow\left\{ -1,1\right\} ^{p-1}} by setting

\displaystyle f\mapsto\mathbb{E}\left[f\right]+\sum_{i\in\left\{ 1,\ldots,p-1\right\} }\hat{f}\left(i\right)x_{i}.

We have

\displaystyle \|f\|_{2}^{2}=\|\mathrm{En}f\|_{2}^{2},

as the {\chi_{i}} are orthonormal and so are the {x_{i}.} Moreover, {\mathrm{T}_{\rho}\circ\mathrm{En}=\mathrm{En}\circ T_{\rho}} by the Fourier formula for {\mathrm{T}_{\rho}.} Since {2}-norms are always smaller than 4-norms on probability spaces, we’ve got the following corollary of Proposition 2.

Lemma 4. For all {\rho\le\frac{1}{10}} and all {f\colon\mathbb{Z}/p\mathbb{Z}\rightarrow\mathbb{C}} we have

\displaystyle \|\mathrm{T}_{\rho}f\|_{4}^{4}\le\|\mathrm{En}\left(f\right)\|_{4}^{4}+\|f-\mathbb{E}\left[f\right]\|_{4}^{4}.

We now reach the final little trick. We define a measure space {Y} whose underlying set is {\mathbb{Z}/p\mathbb{Z}\sqcup\left\{ 0,1\right\} ^{\left\{ p-1\right\} },} and where the measure is given by {\mu\left(i\right)=\frac{1}{p}} for {i\in\mathbb{Z}/p\mathbb{Z}} and {\mu\left(x\right)=\frac{1}{2}^{p-1}} for {x\in\left\{ -1,1\right\} ^{p-1}.} We let {B\colon\mathcal{F}\left(X\right)\rightarrow\mathcal{F}\left(Y\right)} be given by {Bf=\mathrm{En}f} on {\left\{ -1,1\right\} ^{p-1}} and letting it be {f-\mathbb{E}\left[f\right]} on {\mathbb{Z}/p\mathbb{Z}.} This way Lemma 4 takes the form {\|\mathrm{T}_{\rho}f\|_{4}\le\|Bf\|_{4}.}

4. Tensorised operators

The operator {\mathrm{T}_{\rho}} on {\mathbb{Z}/p\mathbb{Z}^{n}} satisfies {\mathrm{T}_{\rho}=\mathrm{T}_{\rho}^{\otimes n},} where the latter {\mathrm{T}_{\rho}} refers to the noise operator on {\mathbb{Z}/p.} The characters {\chi_{\gamma}} satisfy {\chi_{\gamma}=\bigotimes\chi_{\gamma_{i}},} and so we have the Fourier formula

\displaystyle \mathrm{T}_{\rho}f  =\sum_{\gamma}\rho^{\#\left\{ i:\gamma_{i}\ne0\right\} }\hat{f}\left(\gamma\right)\chi_{\gamma}.

We also have

\displaystyle L_{S}\left[f\right]=\bigotimes_{i\in S}\left(f\mapsto f-\mathbb{E}\left[f\right]\right)\otimes\bigotimes_{i\notin S}Id,

and so

\displaystyle L_{S}\left[f\right]=\sum_{\gamma:\gamma_{i}\ne0\text{ for all }i\in S}\hat{f}\left(\gamma\right)\chi_{\gamma}.

This will allow us to conclude that

\displaystyle L_{S}\left[\mathrm{T}_{\rho}f\right]_{S\rightarrow x}=\rho^{\left|S\right|}T_{\rho}L_{S}[f]_{S\rightarrow x}.

We will also encounter the operator {\mathrm{En}^{\otimes n},} which by abusing notation we also call {\mathrm{En}} encodes

\displaystyle f=\sum_{\gamma}\hat{f}\left(\gamma\right)\chi_{\gamma}

as the function {\sum_{\gamma}\widehat{f}\left(\gamma\right)\prod_{i=1}^{n}x_{pi+\gamma_{i}}} on {\left\{ -1,1\right\} ^{n\left(p-1\right)}.}
Now finally we can get to the understanding of the operator {B^{\otimes n}.} The space {Y^{n}} is the disjoint union of {2^{n}} spaces of the form

\displaystyle \left(\mathbb{Z}/p\mathbb{Z}\right)^{S}\times\left(\left\{ -1,1\right\} ^{p-1}\right)^{\left[n\right]\setminus S}.

By definition of the tensor product, for {x,y\in\left(\mathbb{Z}/p\mathbb{Z}\right)^{S}\times\left(\left\{ -1,1\right\} ^{p-1}\right)^{\left[n\right]\setminus S}} is the function

\displaystyle B^{n}f\left(x,y\right)=\mathrm{En}\left(L_{S}[f]_{S\rightarrow x}\right)\left(y\right).

5. Finishing the proof

Proof: Lemmas 3 and 4 yield:

\displaystyle \|\mathrm{T}_{\rho}f\|_{4}^{4}  \le\|B^{\otimes n}f\|_{4}^{4} \displaystyle =\sum_{S\subseteq\left[n\right]}\mathbb{E}_{x\sim\mathbb{Z}/p\mathbb{Z}^{S}}\|\mathrm{En}\left(L_{S}[f]_{S\rightarrow x}\right)\|_{4}^{4},

for any {\rho\le\frac{1}{10}.} We now have

\displaystyle \|\mathrm{T}_{\frac{\rho}{\sqrt{3}}}f\|_{4}^{4} \displaystyle =\sum_{S\subseteq\left[n\right]}(\frac{1}{\sqrt{3}})^{\left|S\right|}\mathbb{E}_{x\sim\mathbb{Z}/p\mathbb{Z}^{S}}\|\mathrm{T}_{\frac{1}{\sqrt{3}}}\left(\mathrm{En}\left(L_{S}[f]_{S\rightarrow x}\right)\right)\|_{4}^{4},
\displaystyle  \le\sum_{S\subseteq\left[n\right]}\mathbb{E}_{x\sim\mathbb{Z}/p\mathbb{Z}^{S}}\|\mathrm{En}\left(L_{S}[f]_{S\rightarrow x}\right)\|_{2}^{4}
\displaystyle =\sum_{S\subseteq\left[n\right]}\mathbb{E}_{x\sim\mathbb{Z}/p\mathbb{Z}^{S}}\|L_{S}[f]_{S\rightarrow x}\|_{2}^{4}.

The first equality follows from the formula {L_{S}\left[\mathrm{T}_{\rho}f\right]_{S\rightarrow x}=\rho^{\left|S\right|}\mathrm{T}_{\rho}L_{S}\left[f\right]_{S\rightarrow x}} and the fact that {\mathrm{T_{\rho}}} commutes with the encoding. The inequality used hypercontractivity on the discrete cube. The last equality follows from the fact that the {\mathrm{En}} operator preserves 2-norms. \Box






Posted in Analysis, Combinatorics, Guest blogger | Tagged | Leave a comment

To cheer you up in difficult times 7: Bloom and Sisask just broke the logarithm barrier for Roth’s theorem!


Thomas Bloom and Olof Sisask: Breaking the logarithmic barrier in Roth’s theorem on arithmetic progressions,    arXiv:200703528


Once again Extraordinary news regarding Roth Theorem! (I thank Ryan Alweiss for telling me about it and Rahul Santhanam for telling me about Thomas and Olof’s earlier attempts.)

Suppose that R_n  is a subset of \{1,2,\dots, n \} of maximum cardinality not containing an arithmetic progression of length 3. Let r_3(n)=|R_n|. Roth proved that r_3(n)=o(n).

A few days ago Thomas Bloom and Olof Sisask proved that for some c>0

r_3(n) \le \frac {n}{\log^{1+c} n}

This is an extraordinary result!!! I will tell you a little more about it below.

Ron Graham

I just heard yesterday the sad news that Ron Graham passed away. Ron was an extraordinary mathematician and an extraordinary person. I first met Ron in Montreal in 1978 and we met many times since then. Ron will be dearly missed.

Back to the new bounds on Roth’s theorem

From an abstract of a lecture by Thomas and Olof: “This is the integer analogue of a result of Bateman and Katz for the model setting of vector spaces over a finite field, and the proof follows a similar structure.”

A catchy (weaker) formulation which goes back to Erdos and Turan is:

Let a_n be a sequence of integers so that \sum \frac{1}{a_n} = \infty, then the sequence contains an arithmetic progression of length three!!

Bloom and Sisask’s result implies, of course, Van der Korput’s result that the primes contain infinitely many 3-terms arithmetic progression as well as Green’s 2005 result asserting it for every  dense subset of primes.

Szemeredi’s celabrated result extended Roth’s theorem to arithmetic progression of any fixed size, and Green-Tao celebrated 2008 result asserts that the primes (or a dense subsets of primes) contain arithmetic progression of any length. (The case of 3-term AP is so far much simpler for all the results mentioned below.)


A little more about the history of the problem below the fold

Continue reading

Posted in Algebra, Combinatorics, Updates | Tagged , | 20 Comments

To cheer you up in difficult times 6: Play Rani Sharim’s two-player games of life, read Maya Bar-Hillel presentation on catching lies with statistics, and more.

Sorry for the long blog silence. In this post I wish to give a few links each of which probably deserves a full post. I will start with

Rani Sharim’s two-player variants of John Conway’s game-of-life

Here is a web-page by a student in my “Game Theory” course where you can play several two-players variants of John Conway’s game-of-life.  (A couple of variants were considered before. See, e.g. this  paper.)

I really enjoyed playing Rani’s games and it can certainly cheer you up in difficult times. Questions about the game, remarks, and suggestions for improvements and for new features, are most welcome.

How to catch lies with statistics, a 2006  presentation by Maya Bar-Hillel

Maya Bar-Hillel (left) and Ester Samuel-Cahn

Here is an interesting 2006 power point presentation entitled How to detect lies with statistics by Maya Bar-Hillel.  This was a talk given by Maya at the conference honoring Prof. Ester Samuel- Cahn , Jerusalem, December 18-20, 2006, and it described a planned research project of Maya Bar-Hillel with Yossi Rinott, David Budescu and myself. At the end we did not pursue it, mainly because each of us was involved in various other projects (but also because we were skeptical about some aspects of it.)  Ester Samuel-Cahn (1933-2015) was a famous Israeli statistician. (Here is a post by Yosi Levy in Hebrew about the conference and about Ester.)

The lecture starts with “Last year, Statistical Science celebrated 50 years for `How to Lie with Statistics’ the book [by Durell Huff] whose title inspired this talk.”

And here are a few other quotes from Maya’s presentation

“We are not sure a general toolkit for detecting lies with statistics can be developed. Perhaps that explains why none yet exists.  We have shown just a collection of anecdotes. But they can be classified and categorized. Some do seem generalizeable, at least to some extent.”

and the conclusion

A famous quip by Fred Mosteller: “It is easy to lie with statistics, but easier to lie without them.”

Likewise, we should say:  “It is possible to detect (some) lies with statistics, but easier to detect them with other means”.

New bounds for Ryser’s conjecture and related problems

Peter Keevash, Alexey Pokrovskiy, Benny Sudakov, and Liana Yepremyan’s paper  New bounds for Ryser’s conjecture and related problems, describes remarkable progress very old questions regarding transversals in Latin square.

Topological Tverberg news

I came across a very interesting paper The topological Tverberg problem beyond prime powers by Florian Frick and Pablo Soberón with new bounds and a new method for topological Tverberg theorem in the non prime-power case.

Jeager’s conjecture refuted

A year ago I came across this cool facebook post by Rupei Xu

OMG! Just learned that Jaeger’s conjecture is false for every t>=3. An interesting consequence of it is that a specific version of Goddyn’s conjecture on thin spanning trees is false, which shows some negative evidence that the thin spanning tree approaches may fail to lead to a constant factor approximation algorithm for ATSP!

A cool rainbow post Short rainbow cycles in graphs and matroids by Tony Huynh

More on the game of life

Let me mention two problems I posted 6-7 years ago about Conway’s game of life. Conway’s game of life for random initial position and Does a noisy version of Conway’s game of life support universal computation?



Posted in Combinatorics, Games, Rationality | Tagged , | Leave a comment

To cheer you up in difficult times 5: A New Elementary Proof of the Prime Number Theorem by Florian K. Richter

Here is a piece of news that will certainly cheer you up: Florian Richter found A new elementary proof of the prime number theorem. (I thank Tami Ziegler for telling me about the new result.)

From left to right: Atle Selberg, Florian Richter, and Paul Erdos

From the Introduction

“One of the most fundamental results in mathematics is the Prime Number Theorem, which describes the asymptotic law of the distribution of prime numbers in the integers.

Prime Number Theorem. Let π(N) denote the number of primes smaller or equal to a positive integer N. Then

\lim_{N \to \infty}   \frac {\pi (N)}{N/\log N}~=~1

The Prime Number Theorem was conjectured independently by Gauß and Legendre towards the end of the 18th century and was proved independently by Hadamard and de la Vallée Poussin in the year 1896. Their proofs are similar and rely on sophisticated analytic machinery involving complex analysis, which was developed throughout the 19th century by the combined effort of many great mathematicians of this era, including Euler, Chebyshev, and Riemann (to name a few).

Even though it was believed for a long time not to be possible, more than 50 years after the analytic proof of the Prime Number Theorem was discovered, an elementary proof was found by Erdős and Selberg [Erd49; Sel49]. In this context, elementary refers to methods that avoid using complex analysis and instead rely only on rudimentary facts from calculus and basic arithmetic identities and inequalities. Their approach was based on Selberg’s famous “fundamental formula”:

\sum_{p \le x}\log ^2(p)~+~\sum_{pq \le x}\log (p)\log (q)~=~2x\log (x)+O(x).

In this paper we provide a new elementary proof of the Prime Number Theorem, using ideas inspired by recent developments surrounding Sarnak’s and Chowla’s conjecture. With the exception of Stirling’s approximation formula, which is used in Section 3 without giving a proof, our proof is self-contained.”

The introduction also cites papers on the history of the analytic method and an abridged version of it by Newman, on several earlier elementary proofs of the PNT, and on recent dynamically inspired way by McNamara of deriving the Prime Number Theorem from Selberg’s fundamental formula.

Some thoughts and remarks

What precisely is elementary, formally?

What is the formal distinction between elementary and non-elementary proofs of the PNT? (This was explained to me several times but I don’t remember the details.) The distinctions between “non-explicit” and “explicit” proofs (or constructions) and between “effective” and “non-effective” proofs are quite clear cut. But here I am less certain.

Are there other important distinctions about proofs that you are aware of?

PNT and Boolean functions (weighted majority functions)

An equivalent formulation of the prime number theorem is the following:

(*)~~~ \sum_{k \le n}\mu (k)~=~o(n),

where \mu is the Mobius function. Consider now a Boolean function f(x_1,x_2, \dots, x_n) with n variables. Here the variables x_i and the function f has values in \{0,1\}. Given n nonnegative weights w_1\le w_2 \le \cdots \le w_n and C>0 the function $f$ is a weighted majority function if f=1 iff w_1x_1+\cdots +w_nx_n \le C. Let p(f) be the probability that f=1, and let \hat f[n] be the top Fourier coefficient of f.


(1) Find conditions for a weighted majority function f that guarantee that

(**)~~~ \hat f([n])~=~o( p(f))

Let p_k be the kth prime, w_k = \log p_k, and C=\log n. The PNT asserts that the Boolean function associated to these parameters satisfies (**). (It is very easy to see that (**) and (*) are essentially the same.)  From vague understanding of some fragments of older elementary proofs of the PNT it looks that the proofs could perhaps be described in this way, namely starting with some weaker properties on the weights (namely the primes)  perhaps the derivation of (**) apply to general weighted majority functions with these properties. (This is related to PNT for systems of Beurling primes.) Finding conditions for (**) is interesting also for other weighted majority functions where the weights are not related to the growth behavior of the primes.  In fact, the following is also interesting:

(2) Find conditions for (general) Boolean functions f that guarantee that (**) holds, namely that the top Fourier coefficients is little o of p(f). Signs of low degree polynomials form a very interesting class of  Boolean functions, and they are of special interest also here.

Remark: The first problem looks very related to PNT for systems of Beurling primes going back to 1937 paper by Arne Beurling. (Based on the inequality 1937 < 1949, I understand that Breuling’s proof was not elementary, and I don’t know if the elementary proofs for PNT gives Breuling’s theorem as well.)

PNT here on the blog and Mobius randomness

The new paper is related to conjectures by Chowla and Sarnak and to the notion of Mobius randomness that we considered in an earlier post. The basic question is for which functions f of k, {\bf (***)} \sum_{k \le n} \mu (k)f(k) = o(\sum_{k \le n}f(k)). This is correct if f is random (or random-like in a variety of meanings), and it turns out to be correct if (in some sense) if f is of low complexity. We also mentioned some analogs of the PNT in this post. Both polymath 4 and polymath5 had some connections to the PNT.

Is PNT to expanders like RH to Ramanujan graphs?

This is another connection to combinatorics, See this post in “In Theory”.

Posted in Number theory, Updates | Tagged , , | 6 Comments

To cheer you up in difficult times 4: Women In Theory present — I will survive

An amazing video

(Update, May18 2020). I failed to explain what WIT is and this may have caused some misunderstanding. Here is a description from the Simons Institute site.

“The Women in Theory (WIT) Workshop is intended for graduate and exceptional undergraduate students in the area of theory of computer science. The workshop will feature technical talks and tutorials by senior and junior women in the field, as well as social events and activities. The motivation for the workshop is twofold. The first goal is to deliver an invigorating educational program; the second is to bring together theory women students from different departments and foster a sense of kinship and camaraderie.”

Continue reading

Posted in Academics, Combinatorics, Computer Science and Optimization, Convexity, Games, Philosophy, Poetry, What is Mathematics, Women in science | 14 Comments

To cheer you up in difficult times 3: A guest post by Noam Lifshitz on the new hypercontractivity inequality of Peter Keevash, Noam Lifshitz, Eoin Long and Dor Minzer

This is a guest post kindly contributed by Noam Lifshitz.

My short introduction: There is nothing like a new hypercontractivity inequality to cheer you up in difficult times and this post describes an amazing new hypercontractivity inequality.  The post describes a recent hypercontractive inequality by Peter Keevash, Noam Lifshitz, Eoin Long and Dor Minzer (KLLM) from their paper: Hypercontractivity for global functions and sharp thresholds. (We reported on this development in this post. By now, there are quite a few important applications.) And for Talagrand’s generic chaining inequality, see this beautiful blog post by Luca Trevisan.

Barry Simon coined the term “hypercontractivity” in the 70s.  (We asked about it here and Nick Read was the first to answer.) A few months ago Barry told us about the early history of hypercontractivity inequalities, and, in particular, the very entertaining story on William Beckner’s Ph. D. qualifying exam.

And now to Noam Lifshitz’s guest post.

Hypercontractivity on product spaces

Analysis of Boolean functions (ABS) is a very rich subject. There are many works whose concern is generalising some of the results on analysis of Boolean functions to other (product) settings, such as functions on the multicube {\left[m\right]^{n},} where {m} is very large. However, in some of these cases the fundemental tools of AOBF seem to be false for functions on the multicube {f\colon\left[m\right]^{n}\rightarrow\mathbb{R}.} However, in the recent work of Keevash, Long, Minzer, and I. We introduce the notion of global functions. These are functions that do not strongly depend on a small set of coordinates. We then show that most of the rich classical theory of AOBF can in fact be generalised to these global functions. Using our machinery we were able to strengthen an isoperimetric stability result of Bourgain, and to make progress on some Erdos-Ko-Rado type open problem.

We now discuss some background on the Fourier analysis on functions on the multicube {f\colon\left\{ 1,\ldots,m\right\} ^{n}\rightarrow\mathbb{R}.}

Derivatives and Laplacians

There are two fundemental types of operators on Boolean functions {f\colon\left\{ 0,1\right\} ^{n}\rightarrow\mathbb{R}.} The first ones are the discrete derivatives, defined by {D_{i}[f]=\frac{f_{i\rightarrow1}-f_{i\rightarrow0}}{2},} where {f_{i\rightarrow x}} denotes the we plug in the value {x} for the {i}th coordinate. The other closely related ones are the laplacians defined by {L_{i}f\left(x\right):=f\left(x\right)-\mathbb{E}f\left(\mathbf{y}\right),} where {\mathbf{y}} is obtained from {x} by resampling its {i}th coordinate.

The laplacians and the derivatives are closely related. In fact, when we plug in {1} in the {i}th coordinate, we obtain {L_{i}[f]_{i\rightarrow1}=D_{i}[f]}, and when we plug in {0} in it, we obtain {L_{i}[f]_{i\rightarrow0}=-D_{i}[f].}

The 2-norm of the {i}th derivative is called the {i}th influence of {f} as it measures the impact of the {i}th coordinate on the value of {f}. It’s usually denoted by {\mathrm{Inf}_{i}[f]}.

Generalisation to functions on the multicube

For functions on the multicube we don’t have a very good notion of a discrete derivative, but it turns out that it will be enough to talk about the laplacians and their restrictions. The Laplacians are again defined via {L_{i}f\left(x\right):=f\left(x\right)-\mathbb{E}f\left(\mathbf{y}\right),} where {\mathbf{y}} is obtained from {x} by resampling its {i}th coordinate. It turns out that in the continuous cube it’s not enough to talk about Laplacians of coordinate, and we will also have to concern ourselves with Laplacians of sets. We define the generalised Laplacians of a set {S} by composing the laplacians corresponding to each coordinate in {S} {L_{\left\{ i_{1},i_{2},\ldots,i_{r}\right\} }\left[f\right]:=L_{i_{1}}\circ\cdots\circ L_{i_{r}}\left[f\right].}

We now need to convince ourselves that these laplacians have something to do with the impact of {S} on the outcome of {f.} In fact, the following notions are equivalent

  1. For each {x,y\in\left[m\right]^{S}}we have {\|f_{S\rightarrow x}-f_{S\rightarrow y}\|_{2}<\delta_{1}}
  2. For each {x\in\left[m\right]^{S}} we have {\|L_{S}[f]_{S\rightarrow x}\|_{2}<\delta_{2},}

in the sense that if (1) holds then (2) holds with {\delta_{2}=C^{\left|S\right|}\delta_{1}} and conversely if (2) holds, then (1) holds with {\delta_{1}=C^{\left|S\right|}\delta_{2}.}

The main theme of our work is that one can understand global function on the continuous cube, and these are functions that satisfy the above equivalent notions for all small sets {S}.

Noise operator, hypercontractivity, and small set expansion

For {\rho\in\left(0,1\right),} the noise operator is given by {\mathrm{T}_{\rho}\left[f\right]\left(x\right)=\mathbb{E}f\left(\mathbf{y}\right)} when {\mathbf{y}} is obtained from {x} by independently setting each coordinate {\mathbf{y}_{i}} to be {\mathbf{x}_{i}} with probability {\rho} and resampling it with uniformly out of {\left\{ -1,1\right\} } otherwise. The process which given {x} outputs {\mathbf{y}} is called the {\rho}-noisy process, and we write {\mathbf{y}\sim N_{\rho}\left(x\right).}

The Bonami hypercontractivity theorem, which was then generalised by Gross and Beckner states that the noise operator {T_{\frac{1}{\sqrt{3}}}} is a contraction from {L^{2}\left(\left\{ 0,1\right\} ^{n}\right)} to {L^{4}\left(\left\{ 0,1\right\} ^{n}\right),} i.e.

\displaystyle \|\mathrm{T}_{\frac{1}{\sqrt{3}}}f\|_{4}\le\|f\|_{2}
for any function {f.}

One consequence of the hypercontractivity theorem is the small set expansion theorem of KKL. It concerns fixed {\rho\in\left(0,1\right)} and a sequence of sets {A_{n}\subseteq\{0,1\}^{n}} with {\left|A_{n}\right|=o\left(2^{n}\right).} The small set expansion theorem states that if we choose {\mathbf{x}\sim A_{n}} uniformly and a noisy {\mathbf{y}\sim N_{\rho}\left(\mathbf{x}\right),} then {\mathbf{y}} will reside outside of {A_{n}} almost surely.

The Generalisation to the multicube:

The small set expansion theorem and the hypercontractivity theorem both fail for function on the multicube that are of a very local nature. For instance, let {A} be the set of all {x\in\left\{ 1,\ldots,m\right\} ^{n},} such that {x_{1}} is {m.} Then {A} is of size {m^{n-1},} which is {o\left(m^{n}\right)} if we allow {m} to be a growing function of {n}. However, the {\rho}-noisy process from the set stays within the set with probability {\rho.} For a similar reason the hypercontractivity theorem fails as is for functions on {\left\{ 1,\ldots,m\right\} ^{n}.} However we were able to generalise the hypercontractivity theorem by taking the globalness of {f} into consideration.

Our main hypercontractive inequality is the following

Theorem 1.

\displaystyle \|\mathrm{T}_{\frac{1}{100}}f\|_{4}^{4}\le\sum_{S\subseteq\left[n\right]}\mathbb{E}_{\mathbf{x}\sim\left\{ 1,\ldots,m\right\} ^{m}}\left(\|L_{S}\left[f\right]_{S\rightarrow\mathbf{x}}\|_{2}^{4}\right).

The terms {\|L_{S}\left[f\right]_{S\rightarrow x}\|_{2}} appearing on the right hand side are small whenever {f} has a small dependency on {S} and it turns out that you have the following corrolary of it, which looks a bit more similar to the hypercontractive intequality.


Corollary 2.

Let {f\colon\left\{ 1,\ldots,m\right\} ^{n}\rightarrow\mathbb{R}}, and uppose that {\|L_{S}[f]_{S\rightarrow x}\|_{2}\le4^{\left|S\right|}\|f\|_{2}} for all sets {S.}

Then {\mathrm{\|\mathrm{T}_{\frac{1}{1000}}f\|_{4}\le\|f\|_{2}.}}

Finally, one might ask wonder why this globalness notion appears only when we look at large values of {m} and not when {m=2.} I think the corollary is a good explanation for that as {\|f\|_{2}^{2}\ge\left(\frac{1}{2}\right)^{\left|S\right|}\|f_{S\rightarrow x}\|_{2}^{2}} holds trivially for any Boolean function {f\colon\left\{ 0,1\right\} ^{n}\rightarrow\mathbb{R}.}

Posted in Analysis, Combinatorics, Computer Science and Optimization, Guest post, Poetry, Probability | Tagged , , , , | 4 Comments

Harsanyi’s Sweater

Today is the Holocaust Remembrance Day in Israel.  Here is a moving story from the  paper  about  John Harsanyi,  Harsanyi’s Sweater, by Robert J. Aumann.

It is 1944 in Budapest, and John is in his early twenties. He has been taken for deportation, with all that that implies. Arriving at the railroad station, he puts his knapsack down and wanders off a few yards, under the watchful eye of a guard. Then the guard is distracted for a moment, and John sees his chance to escape. But in the knapsack there is a beautiful, warm sweater, lovingly hand-knitted for John by his mother. John hesitates; should he — can he — abandon the sweater? After a moment, the urge to live takes over, and he slips away, taking refuge in a convent by previous arrangement. He survives the Holocaust to become the great thinker that he becomes. Hesitant, careful, open-minded, undogmatic — and in spite of that, or perhaps because of that — great.

Aumann himself  was born in Frankfurt, Germany, and fled at the age of eight to the United States with his family in, two weeks before the 1938 Kristallnacht.

I plan to mention this story in my game theory class this evening and also to mention five great game theorists named John.

Continue reading

Posted in Games, Teaching | Tagged , | Leave a comment

To cheer you up in difficult times II: Mysterious matching news by Gal Beniamini, Naom Nisan, Vijay Vazirani and Thorben Tröbst!

Matching is one of the richest gold mines for ideas and results in mathematics, computer science and other areas.  Today I want to briefly tell you about a curious, surprising, mysterious, and cheerful recent result by Gal Beniamini and Noam Nisan and a subsequent work of Vijay Vazirani. It is a result that will cheer up combinatorialists on both sides of the aisle: graph theorists and researchers in extremal and probabilistic combinatorics as well as algebraic and enumerative combinatorialists.  (And it is related to query complexity, Eulerian lattices, Birkhoff’s polytope, a theorem of Lou Billera and Aravamuthan Sarangarajan, evasiveness, analysis of Boolean functions, and various other things.) At the end of the post I will remind you of a central problem in matching theory: that of extending Lovasz’ randomized algorithm for matching to general graphs. (Perhaps methods from algebraic combinatorics can help.)

I will start with sad news. John Horton Conway, an amazing mathematical hero,  passed away a few days ago. There are very nice posts on Conway’s work by Scott Aaronson (with many nice memories in the comments section), by Terry Tao, and by Dick Lipton and Ken Regan. And a moving obituary on xkcd with a touch of ingenuity of Conway’s style. (There is also a question on MO  “Conway’s less known results,” and two questions on the game of life (I, II).)

Another reading material to cheer you up is my paper: The argument against quantum computers, the quantum laws of nature, and Google’s supremacy claims. It is for Laws, Rigidity and Dynamics, Proceedings of the ICA workshops 2018 & 2019 in Singapore and Birmingham. Remarks are most welcome. (Update (May, 25 2020): An interesting discussion on Hacker news.)

Update: starting today, the algebraic combinatorics online workshop.  Here is the schedule for the first week, and for the second week.


Matching theory by Lovasz and Plummer is probably one of the best mathematics books ever written. 

Bipartite Perfect Matching as a Real Polynomial

Bipartite Perfect Matching as a Real Polynomial, by Gal Beniamini and Noam Nisan

Abstract: We obtain a description of the Bipartite Perfect Matching decision problem as a multilinear polynomial over the Reals. We show that it has full degree and (1-o_n(1))\cdot 2^{n^2}  monomials with non-zero coefficients. In contrast, we show that in the dual representation (switching the roles of 0 and 1) the number of monomials is only exponential in \Theta(n \log n)  Our proof relies heavily on the fact that the lattice of graphs which are “matching-covered” is Eulerian.

And here is how the paper starts

Every Boolean function f:\{0,1\}^n\to\{0,1\} can be represented in a unique way as a Real multilinear polynomial. This representation and related ones (e.g. using the {1,−1} basis rather than {0,1}– the “Fourier transform” over the hypercube, or approximation variants) have many applications for various complexity and algorithmic purposes. See, e.g., [O’D14] for a recent textbook. In this paper we derive the representation of the bipartite-perfect-matching decision problem as a Real polynomial.

Definition. The Boolean function BPM_n(x_{1,1},\dots,x_{n,n}) is defined to be 1 if and only if the bipartite graph whose edges are\{(i,j):x_{i,j}=1\} has a perfect matching, and 0 otherwise.

And here are the two main theorems regarding this polynomial and the polynomial for the dual representation:

(For the second theorem you need the notion of totally ordered bipartite graphs.)

And here is a nice picture!

A very interesting open problem is:

Problem: Can the Beniamini-Nisan results be extended to general (non-bipartite) graphs

This reminds me of an old great problem:

Problem: Does Lovasz’ randomized algorithm for matching extend to the non-bipartite case?

For both problems methods of algebraic combinatorics may be helpful.

An Extension by Vijay Vazirani and Thorben Tröbst

A Real Polynomial for Bipartite Graph Minimum Weight Perfect Matchings, Thorben Tröbst, Vijay V. Vazirani


In a recent paper, Beniamini and Nisan gave a closed-form formula for the unique multilinear polynomial for the Boolean function determining whether a given bipartite graph G \subset K_{n,n} has a perfect matching, together with an efficient algorithm for computing the coefficients of the monomials of this polynomial. We give the following generalization: Given an arbitrary non-negative weight function w on the edges of K_{n,n}, consider its set of minimum weight perfect matchings. We give the real multilinear polynomial for the Boolean function which determines if a graph G \subset K_{n,n} contains one of these minimum weight perfect matchings.

Three more remarks about VVV

Three more VVV remarks: in the Tel Aviv theory fast fest three months ago (it seems like ages ago) Vijay Vazirani gave a lecture about matching. Here is the link to Vijay’s lecture, and to all plenary lectures.  At the end, I asked him how he explains that matching theory is such inexhaustable gold mine and Vijay mentioned the fact that a polynomial-time algorithm for the assignment problem (which is closely related to matching) was already found by Jacobi in 1890. (Unfortunately VJ’s inspiring answer was not recorded). A few years ago Vijay published a simplified proof of a fantastic famous result he first proved with Silvio Micaly 34 years earlier. And here is a most amazing story: a few years ago I went to the beach in Tel Aviv and I discovered Vijay swimming just next to me.  We were quite happy to see each other and Vijay told me a few things about matching, economics and biology. This sounds now like a truly surrealistic story, and perhaps we even shook hands.




Posted in Combinatorics, Computer Science and Optimization | Tagged , , , | 10 Comments