More around Borsuk

Piotr Achinger told me two things about Karol Borsuk:

 

From Wikipedia: Dunce hat folding. The blue hole is only for a better view.

Borsuk trumpet

is another name for the contractible, non-collapsible space also commonly called the “dunce hat“. (See also this post.) For a birthday conference in honor of Borsuk, a cake of this shape was baked and served.

Hodowla zwierzątek

(Polish; in English: Animal Husbandry) is (from Wikipedia; here is the link) a dice game invented and published by Karol Borsuk at his own expense in 1943, during the German occupation of Warsaw. Sales of the game were a way for Borsuk to support his family after he lost his job following the closure of Warsaw University by the German occupation authorities. The original sets were produced by hand by Borsuk’s wife, Zofia. The author of the drawings of the animals was Janina Borsuk née Śliwicka. The game was one of the first in the world to feature a twelve-sided die.

Analysis of Boolean Functions – Week 7

Lecture 11

The Cap Set problem

We presented Meshulam’s bound 3^n/n for the maximum number of elements in a subset A of (\mathbb{Z}/3\mathbb{Z})^n not containing a triple x,y,z of distinct elements whose sum is 0.
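
As a warm-up, here is a minimal brute-force sketch in Python (the function names and the small example are mine, and this is of course not Meshulam's argument) of what it means for a set to be a cap set, i.e., to contain no three distinct elements summing to zero:

```python
# A brute-force cap-set check: A is a cap set in (Z/3Z)^n if no three
# distinct elements of A sum to the zero vector coordinatewise mod 3.
from itertools import combinations

def is_cap_set(A):
    for x, y, z in combinations(A, 3):
        if all((xi + yi + zi) % 3 == 0 for xi, yi, zi in zip(x, y, z)):
            return False
    return True

# For n = 2 the maximum size of a cap set is 4; here is one example:
A = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(is_cap_set(A))  # True
```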

The theorem is analogous to Roth’s theorem for 3-term arithmetic progressions and, in fact, it is a sort of purified analog of Roth’s proof, as some difficulties that arise over the integers are not present here. There are two ingredients in the proof: one can be referred to as the “Hardy-Littlewood circle method” and the other is the “density increasing” argument.

We first talked about the density-increasing method and showed how the KKL theorem for the influence of sets follows from the KKL theorem for the maximum individual influence. I mentioned what is known about the influence of large sets and what is still open. (I will devote a separate post to this topic.)

Then we went over Meshulam’s proof in full detail. A good place to see a detailed sketch of the proof is this post on Gowers’s blog.

Let me copy Tim’s sketch over here:

Sketch of proof (from Gowers’s blog).

Next, here is a brief sketch of the Roth/Meshulam argument. I am giving it not so much for the benefit of people who have never seen it before, but because I shall need to refer to it. Recall that the Fourier transform of a function f:\mathbb{F}_3^n\to\mathbb{C} is defined by the formula

\hat{f}(r)=\mathbb{E}f(x)\omega^{r.x},

where \mathbb{E} is short for 3^{-n}\sum, \omega stands for \exp(2\pi i/3) and r.x is short for \sum_ir_ix_i. Now

\mathbb{E}_{x+y+z=0}f(x)f(y)f(z)=f*f*f(0).

(Here \mathbb{E} stands for 3^{-2n}\sum, since there are 3^{2n} solutions of x+y+z=0.) By the convolution identity and the inversion formula, this is equal to \sum_r\hat{f}(r)^3.

Now let f be the characteristic function of a subset A\subset\mathbb{F}_3^n of density \delta. Then \hat{f}(0)=\delta. Therefore, if A contains no solutions of x+y+z=0 (apart from degenerate ones — I’ll ignore that slight qualification for the purposes of this sketch as it makes the argument slightly less neat without affecting its substance) we may deduce that

\sum_{r\ne 0}|\hat{f}(r)|^3\geq\delta^3.

Now Parseval’s identity tells us that

\sum_r|\hat{f}(r)|^2=\mathbb{E}_x|f(x)|^2=\delta,

from which it follows that \max_{r\ne 0}|\hat{f}(r)|\geq\delta^2.

Recall that \hat{f}(r)=\mathbb{E}_xf(x)\omega^{r.x}. The function x\mapsto\omega^{r.x} is constant on each of the three hyperplanes r.x=b (here I interpret r.x as an element of \mathbb{F}_3). From this it is easy to show that there is a hyperplane H such that \mathbb{E}_{x\in H}f(x)\geq\delta+c\delta^2 for some absolute constant c. (If you can’t be bothered to do the calculation, the basic point to take away is that if \hat{f}(r)\geq\alpha then there is a hyperplane perpendicular to r on which A has density at least \delta+c\alpha, where c is an absolute constant. The converse holds too, though you recover the original bound for the Fourier coefficient only up to an absolute constant, so non-trivial Fourier coefficients and density increases on hyperplanes are essentially the same thing in this context.)

Thus, if A contains no arithmetic progression of length 3, there is a hyperplane inside which the density of A is at least \delta+c\delta^2. If we iterate this argument 1/c\delta times, then we can double the (relative) density of A. If we iterate it another 1/2c\delta times, we can double it again, and so on. The number of iterations is at most 2/c\delta, so by that time there must be an arithmetic progression of length 3. This tells us that we need lose only 2/c\delta dimensions, so for the argument to work we need n\geq 2/c\delta, or equivalently \delta\geq C/n.
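
The identity \mathbb{E}_{x+y+z=0}f(x)f(y)f(z)=\sum_r\hat{f}(r)^3 at the heart of this sketch is easy to verify numerically. Here is a small Python check (a sketch; the names and the random test function are mine), using exactly the normalizations defined above:

```python
# Check E_{x+y+z=0} f(x)f(y)f(z) = sum_r fhat(r)^3 over F_3^n, where
# fhat(r) = 3^{-n} sum_x f(x) omega^{r.x} and omega = exp(2*pi*i/3).
import numpy as np
from itertools import product

n = 3
omega = np.exp(2j * np.pi / 3)
points = list(product(range(3), repeat=n))
rng = np.random.default_rng(0)
f = {x: float(rng.integers(0, 2)) for x in points}  # a random 0/1 function

def fhat(r):
    return sum(f[x] * omega ** sum(ri * xi for ri, xi in zip(r, x))
               for x in points) / 3 ** n

# Left side: average over the 3^(2n) solutions of x + y + z = 0.
lhs = sum(f[x] * f[y] * f[tuple((-xi - yi) % 3 for xi, yi in zip(x, y))]
          for x in points for y in points) / 3 ** (2 * n)
rhs = sum(fhat(r) ** 3 for r in points)
print(abs(lhs - rhs) < 1e-9)  # True: the two sides agree
```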

 

Lecture 12

Error-Correcting Codes

We discussed error-correcting codes. A binary code C is simply a subset of the discrete n-dimensional cube. This is a familiar object, but in coding theory we ask different questions about it. A code is linear if it forms a linear subspace of (Z/2Z)^n. The minimal distance of a code is the minimum Hamming distance between two distinct elements; for linear codes it is simply the minimum weight of a non-zero element of the code. We mentioned codes over larger alphabets, spherical codes, and even codes in more general metric spaces. Error-correcting codes are among the most glorious applications of mathematics, and their theory is related to many topics in pure mathematics and theoretical computer science.
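
Here is a small Python illustration of the last remark about linear codes (a sketch; the toy generators are mine): the minimal distance equals the minimal weight of a non-zero codeword, because the difference (which over Z/2Z is the sum) of two codewords is again a codeword.

```python
from itertools import product

def span_gf2(generators):
    """All linear combinations of the generator rows over Z/2Z."""
    n = len(generators[0])
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
                  for i in range(n))
            for coeffs in product((0, 1), repeat=len(generators))}

def min_distance(code):
    words = list(code)
    return min(sum(a != b for a, b in zip(u, w))
               for i, u in enumerate(words) for w in words[i + 1:])

def min_weight(code):
    return min(sum(w) for w in code if any(w))

C = span_gf2([(1, 0, 1, 1, 0), (0, 1, 0, 1, 1)])
print(min_distance(C), min_weight(C))  # 3 3: equal, since C is linear
```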

1) An extremal problem for codes: What is the maximum size of a binary code of length n with minimal distance d? We mentioned the volume (or Hamming) upper bound and the Gilbert-Varshamov lower bound. We concentrated on the case of codes of positive rate.
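
Both bounds are easy to compute explicitly. Here is a sketch (the helper names are mine): the Hamming bound gives |C| \le 2^n/V(n,\lfloor (d-1)/2 \rfloor) and Gilbert-Varshamov guarantees a code with |C| \ge 2^n/V(n,d-1), where V(n,r) is the number of points in a Hamming ball of radius r.

```python
from math import comb

def ball(n, r):
    """V(n, r): number of binary words within Hamming distance r of a point."""
    return sum(comb(n, i) for i in range(r + 1))

def hamming_upper(n, d):
    return 2 ** n // ball(n, (d - 1) // 2)

def gilbert_varshamov_lower(n, d):
    return -(-2 ** n // ball(n, d - 1))   # ceiling division

for n, d in [(7, 3), (15, 5), (23, 7)]:
    print(n, d, gilbert_varshamov_lower(n, d), hamming_upper(n, d))
# For (7, 3) the Hamming upper bound 16 is attained by the Hamming code.
```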

2) Examples of codes: We mentioned the Hamming code and the Hadamard code and considered some of their basic properties. Then we mentioned the long code, which is very important in the study of hardness of computation.
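
For concreteness, here is a sketch (mine) of the Hamming code of length 7, defined by a parity-check matrix whose columns are the seven non-zero vectors of length 3; a brute-force check confirms that it has 16 codewords and minimal distance 3:

```python
from itertools import product

# Parity-check matrix of the Hamming [7,4,3] code: its columns (stored here
# as the rows of H) are the seven non-zero 0/1 vectors of length 3.
H = [[int(b) for b in format(c, '03b')] for c in range(1, 8)]

def in_code(w):
    # w is a codeword iff all three parity checks vanish over Z/2Z
    return all(sum(w[i] * H[i][j] for i in range(7)) % 2 == 0 for j in range(3))

code = [w for w in product((0, 1), repeat=7) if in_code(w)]
print(len(code))                            # 16 = 2^4 codewords
print(min(sum(w) for w in code if any(w)))  # minimal weight = distance: 3
```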

3) Linearity testing: Linearity testing is closely related to the Hadamard code. We described the Blum-Luby-Rubinfeld linearity test and analyzed it. This is very similar to the Fourier-theoretic formula and argument we saw last time for the cap-set problem.
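
Here is a small simulation sketch of the BLR test (the test functions and parameters are mine): pick random x and y and check that f(x)+f(y)=f(x+y) over Z/2Z; a linear function always passes, while a function far from linear is rejected with substantial probability.

```python
import random

n = 6
random.seed(1)

def blr_reject_rate(f, trials=20000):
    """Fraction of random (x, y) pairs with f(x) XOR f(y) != f(x XOR y)."""
    rejects = 0
    for _ in range(trials):
        x, y = random.getrandbits(n), random.getrandbits(n)
        if f(x) ^ f(y) != f(x ^ y):
            rejects += 1
    return rejects / trials

parity = lambda x: bin(x & 0b101001).count('1') % 2  # a linear function
table = [random.getrandbits(1) for _ in range(2 ** n)]
print(blr_reject_rate(parity))              # 0.0: linear functions always pass
print(blr_reject_rate(lambda x: table[x]))  # roughly 1/2 for a random function
```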

We started to describe Delsarte’s linear programming method, to be continued next week.

Analysis of Boolean Functions – Weeks 5 and 6

Lecture 7

First passage percolation

1) Models of percolation.

We talked about percolation, introduced by Broadbent and Hammersley in 1957. The basic model is a model of random subgraphs of a grid in n-dimensional space. (Other graphs were considered later as well.) Here, a grid is a graph whose vertices have integer coordinates and where two vertices are adjacent if their Euclidean distance is one. Every edge of the grid-graph is taken (or is “open” in the percolation jargon) with the same probability p, independently. We mentioned some basic questions: Is there an infinite component? How many infinite components are there? What is the probability that the origin belongs to such an infinite component, as a function of p?
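
To get a feel for the model, here is a small simulation sketch (all parameters are mine, chosen only for illustration) of bond percolation on a finite piece of the planar grid, measuring the size of the largest open cluster below, at, and above the critical probability:

```python
import random

def largest_open_cluster(L, p, seed=0):
    """Largest connected cluster of open edges on an L x L grid."""
    rnd = random.Random(seed)
    open_edge = {}
    def is_open(u, v):
        e = (min(u, v), max(u, v))
        if e not in open_edge:
            open_edge[e] = rnd.random() < p   # each edge open with prob. p
        return open_edge[e]
    def neighbors(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= x + dx < L and 0 <= y + dy < L:
                yield (x + dx, y + dy)
    seen, best = set(), 0
    for start in ((x, y) for x in range(L) for y in range(L)):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                          # flood fill along open edges
            u = stack.pop()
            size += 1
            for nb in neighbors(*u):
                if nb not in seen and is_open(u, nb):
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

for p in (0.3, 0.5, 0.7):  # 1/2 is Kesten's critical probability for the plane
    print(p, largest_open_cluster(60, p))
```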

I mentioned two results: The first is Kesten’s celebrated result that the critical probability for planar percolation is 1/2. The other, by Burton and Keane, is that in very general situations there is almost surely a unique infinite component or none at all. This was a good point to mention a famous conjecture, the dying percolation conjecture (especially in dimension 3), which asserts that at the critical probability there is no infinite component.

We will come back to this basic model of percolation later in the course, but for now we moved on to a related, more recent model.

2) First passage percolation

We talked about first passage percolation, introduced by Hammersley and Welsh in 1965. Again we consider the infinite grid graph, and this time we let the length of every edge be 1 with probability 1/2 and 2 with probability 1/2, independently. These weights define a random metric on this infinite graph that we wish to understand. We consider two vertices (0,0) and (v,0) (in higher dimensions the second entry can stand for a (d-1)-dimensional vector, but we can restrict our attention to d=2), and we let D(x) be the distance between these two vertices, as a function of the vector x of edge lengths. We explained how D is an integer-valued function on a discrete cube with Lipschitz constant 1. The question we want to address is: What is the variance of D?
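
Here is a simulation sketch of this model (mine; to keep the computation finite, the grid is restricted to a strip around the axis, which only approximates the true distance): edge lengths are 1 or 2 with probability 1/2 each, and D is computed by Dijkstra's algorithm.

```python
import heapq, random, statistics

def fpp_distance(v, width=20, seed=None):
    # Random edge lengths: 1 or 2 with probability 1/2 each, sampled lazily.
    rnd = random.Random(seed)
    weight = {}
    def w(a, b):
        e = (min(a, b), max(a, b))
        if e not in weight:
            weight[e] = rnd.choice((1, 2))
        return weight[e]
    # Dijkstra from (0,0) to (v,0), restricted to a strip (this restriction
    # can only increase the distance, so it is an approximation).
    dist = {(0, 0): 0}
    heap = [(0, (0, 0))]
    while heap:
        d, u = heapq.heappop(heap)
        if u == (v, 0):
            return d
        if d > dist.get(u, float('inf')):
            continue
        x, y = u
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if -width <= nb[0] <= v + width and -width <= nb[1] <= width:
                nd = d + w(u, nb)
                if nd < dist.get(nb, float('inf')):
                    dist[nb] = nd
                    heapq.heappush(heap, (nd, nb))

samples = [fpp_distance(30, seed=i) for i in range(200)]
print(statistics.mean(samples), statistics.variance(samples))
```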

Why do we study the variance when we do not know the expectation exactly, you may ask? (I remember Larry Shepp asking this when I talked about it at Bell Labs in the early 90s.) One answer is that we know that the expectation of D is linear, while for the variance we do not know how it behaves. Second, we expect that determining the expectation precisely will depend on the model, while the way the variance grows, and perhaps D‘s limiting distribution, will be universal (say, for dimension 2). And third, we have not given up on the expectation either.

Here is what we showed:

1) From the inequality var(D)=\sum_{S\ne \emptyset}\hat D^2(S)\le\sum \hat D^2(S)|S| we derived Kesten’s bound var(D)=O(v). (A small numerical check of the first identity is sketched right after this list.)

2) We considered the value s such that \mu(D>s)=t, and showed by the basic inequality above that the variance of D conditioned on D>s is also bounded by v. This corresponds to an exponential tail estimate proved by Kesten.

3) Using hypercontractivity we showed that the variance of D conditioned on D>s is actually bounded above by v/\log (1/t), which corresponds to Talagrand’s sub-Gaussian tail estimate.

4) Almost finally, based on a certain very plausible lemma, we used hypercontractivity to show that most Fourier coefficients of D are above the \log v level, improving the variance upper bound to O(v/\log v).

5) Since the plausible lemma is still open (see this MO question), we showed how we can “shortcut” the lemma and prove the upper bound without it.
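
Here is the numerical check promised in item 1: for any function on the discrete cube (with the uniform measure), var(f)=\sum_{S\ne\emptyset}\hat f^2(S), which is just Parseval's formula. A small Python sketch (the names are mine):

```python
import numpy as np
from itertools import product

m = 4
rng = np.random.default_rng(0)
xs = list(product((0, 1), repeat=m))
f = rng.normal(size=len(xs))              # an arbitrary function on the cube

def fhat(S):
    # Fourier-Walsh coefficient: E_x f(x) * prod_{i in S} (-1)^{x_i}
    chi = np.array([(-1) ** sum(x[i] for i in S) for x in xs])
    return float(np.mean(f * chi))

subsets = [tuple(i for i in range(m) if mask >> i & 1) for mask in range(2 ** m)]
total = sum(fhat(S) ** 2 for S in subsets if S)   # sum over non-empty S
print(np.isclose(np.var(f), total))               # True
```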

The major open question

It is an open question to give an upper bound of v^{1-\epsilon}, or even v^{2/3}, which is the expected answer in dimension two. Michel Ledoux wisely proposes to prove it just for directed percolation in the plane (where all edges are directed up and to the right) from (0,0) to (v,v), where the edge lengths are Gaussian or Bernoulli.

Lecture 8

Three Further Applications of Discrete Fourier Analysis (without hypercontractivity)

The next three topics will use Fourier analysis but not hypercontractivity. We started by talking about them.

1) The cap-set problem, some perspective and a little more extremal combinatorics

We talked about Roth’s theorem, the density Hales-Jewett theorem, and the Erdős-Rado delta-system theorem and conjecture. We mentioned linearity testing.

2) Upper bounds for error-correcting codes

This was a good place to mention (and easily prove) a fundamental property used in both these cases: the Fourier transform of the convolution of two functions f and g is the product of the Fourier transforms of f and g.
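
Here is a quick numerical sketch (mine) of this property on the discrete cube, with the convolution f*g(x)=\mathbb{E}_y f(y)g(x+y) and addition meaning XOR:

```python
# Verify that the Fourier transform of f*g equals the product of the
# transforms, for arbitrary functions on {0,1}^n.
import numpy as np

n = 4
N = 2 ** n
rng = np.random.default_rng(0)
f, g = rng.normal(size=N), rng.normal(size=N)

# chi[S][x] = (-1)^{|S and x|}: the Walsh characters, indexed by bitmasks.
chi = np.array([[(-1) ** bin(x & s).count('1') for x in range(N)]
                for s in range(N)])

conv = np.array([np.mean([f[y] * g[x ^ y] for y in range(N)])
                 for x in range(N)])
lhs = chi @ conv / N                    # Fourier transform of f*g
rhs = (chi @ f / N) * (chi @ g / N)     # product of the two transforms
print(np.allclose(lhs, rhs))            # True
```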

3) Social choice and Arrow’s theorem

The Fourier-theoretic proof of Arrow’s theorem uses only Parseval’s formula, so we are going to start with that.

Fourier-theoretic proof of Arrow’s theorem and related results.

We talked a little about Condorcet (we will later give a more detailed introduction to social choice). We mentioned Condorcet’s paradox, Condorcet’s jury theorem, and the notion of a Condorcet winner.

Next we formulated Arrow’s theorem. Lecture 9 was devoted to a Fourier-theoretic proof of Arrow’s theorem (in the balanced case). You can find it discussed in this blog post by Noam Nisan. Lecture 10 mentioned a few further applications of the Fourier method related to Arrow’s theorem, as well as a simple combinatorial proof of Arrow’s theorem in full generality. For the Fourier proof of Arrow’s theorem we showed that a Boolean function with all its non-zero Fourier coefficients on levels 0 and 1 is a constant, a dictatorship, or an anti-dictatorship. This time we formulated the FKN theorem and showed how it implies a stability version of Arrow’s theorem in the neutral case.
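
The statement about levels 0 and 1 can be checked by brute force for small n. Here is a sketch (mine) that goes over all 256 Boolean functions on three variables, finds those whose Fourier expansion is supported on levels 0 and 1, and verifies that each is a constant, a dictatorship, or an anti-dictatorship:

```python
from itertools import product

n = 3
xs = list(product((-1, 1), repeat=n))          # the cube {-1,1}^3

def chi(x, S):
    p = 1
    for i in S:
        p *= x[i]
    return p

def fourier_levels(vals):
    """The levels |S| carrying a non-zero Fourier coefficient of f."""
    levels = set()
    for mask in range(2 ** n):
        S = [i for i in range(n) if mask >> i & 1]
        if abs(sum(v * chi(x, S) for x, v in zip(xs, vals))) > 1e-9:
            levels.add(len(S))
    return levels

for vals in product((-1, 1), repeat=len(xs)):  # all 2^8 Boolean functions
    if fourier_levels(vals) <= {0, 1}:
        is_constant = len(set(vals)) == 1
        is_pm_dictator = any(all(v == s * x[i] for x, v in zip(xs, vals))
                             for i in range(n) for s in (1, -1))
        assert is_constant or is_pm_dictator
print("verified for n =", n)
```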