# Analysis of Boolean Functions – Week 7

## The Cap Set problem

We presented Meshulam’s  bound $3^n/n$ for the maximum number of elements in a subset A of $(\mathbb{Z}/3Z)^n$ not containing a triple x,y,x of distinct elements whose sum is 0.

The theorem is analogous to Roth’s theorem for 3-term arithmetic progressions and, in fact, it is a sort of purified analog to Roth’s proof, as some difficulties over the integers are not presented here.  There are two ingredients in the proof: One can be referred to as the “Hardy-Littlewood circle method” and the other is the “density increasing” argument.

We first talked about density-increasing method and showed how KKL’s theorem for influence of sets follows from KKL’s theorem for the maximum individual influence. I mentioned what is known about influence of large sets and what is still open. (I will devote to this topic a separate post.)

Then we went over Meshulam’s proof in full details. A good place to see a detailed sketch of the proof is in this post  on Gowers’ blog.

Let me copy Tim’s sketch over here:

Sketch of proof (from Gowers’s blog).

Next, here is a brief sketch of the Roth/Meshulam argument. I am giving it not so much for the benefit of people who have never seen it before, but because I shall need to refer to it. Recall that the Fourier transform of a function $f:\mathbb{F}_3^n\to\mathbb{C}$ is defined by the formula

$\hat{f}(r)=\mathbb{E}f(x)\omega^{r.x},$

where $\mathbb{E}$ is short for $3^{-n}\sum,$ $\omega$ stands for $\exp(2\pi i/3)$ and $r.x$ is short for $\sum_ir_ix_i.$ Now

$\mathbb{E}_{x+y+z=0}f(x)f(y)f(z)=f*f*f(0).$

(Here $\mathbb{E}$ stands for $3^{-2n}\sum,$ since there are $3^{2n}$ solutions of $x+y+z=0.$) By the convolution identity and the inversion formula, this is equal to $\sum_r\hat{f}(r)^3.$

Now let $f$ be the characteristic function of a subset $A\subset\mathbb{F}_3^n$ of density $\delta.$ Then $\hat{f}(0)=\delta.$ Therefore, if $A$ contains no solutions of $x+y+z=0$ (apart from degenerate ones — I’ll ignore that slight qualification for the purposes of this sketch as it makes the argument slightly less neat without affecting its substance) we may deduce that

$\sum_{r\ne 0}|\hat{f}(r)|^3\geq\delta^3.$

Now Parseval’s identity tells us that

$\sum_r|\hat{f}(r)|^2=\mathbb{E}_x|f(x)|^2=\delta,$

from which it follows that $\max_{r\ne 0}|\hat{f}(r)|\geq\delta^2.$

Recall that $\hat{f}(r)=\mathbb{E}_xf(x)\omega^{r.x}.$ The function $x\mapsto\omega^{r.x}$ is constant on each of the three hyperplanes $r.x=b$ (here I interpret $r.x$ as an element of $\mathbb{F}_3$). From this it is easy to show that there is a hyperplane $H$ such that $\mathbb{E}_{x\in H}f(x)\geq\delta+c\delta^2$ for some absolute constant $c.$ (If you can’t be bothered to do the calculation, the basic point to take away is that if $\hat{f}(r)\geq\alpha$ then there is a hyperplane perpendicular to $r$ on which $A$ has density at least $\delta+c\alpha,$ where $c$ is an absolute constant. The converse holds too, though you recover the original bound for the Fourier coefficient only up to an absolute constant, so non-trivial Fourier coefficients and density increases on hyperplanes are essentially the same thing in this context.)

Thus, if $A$ contains no arithmetic progression of length 3, there is a hyperplane inside which the density of $A$ is at least $\delta+c\delta^2.$ If we iterate this argument $1/c\delta$ times, then we can double the (relative) density of $A.$ If we iterate it another $1/2c\delta$ times, we can double it again, and so on. The number of iterations is at most $2/c\delta,$ so by that time there must be an arithmetic progression of length 3. This tells us that we need lose only $2/c\delta$ dimensions, so for the argument to work we need $n\geq 2/c\delta,$ or equivalently $\delta\geq C/n.$

Lecture 12

## Error-Correcting Codes

We discussed error-correcting codes. A binary code C is simply a subset of the discrete n-dimensional cube. This is a familiar object but in coding theory we asked different questions about it. A code is linear if it forms a vector space over $(Z/2Z)^n.$ The minimal distance of a code is the minimum Hamming distance between two distinct elements, and in the case of linear codes it is simply the minimum weight of a non-zero element of the codes. We mentioned codes over larger alphabets, spherical codes and even codes in more general metric spaces. Error-correcting codes are among the most glorious applications of mathematics and their theory is related to many topics in pure mathematics and theoretical computer science.

1) An extremal problem for codes: What is the maximum size of a binary code of length n with minimal distance d. We mentioned the volume (or Hamming) upper bound and the Gilbert-Varshamov lower bound. We concentrated on the case of codes of positive rate.

2) Examples of codes: We mentioned the Hamming code and the Hadamard code and considered some of their basic properties. Then we mentioned the long code which is very important in the study of Hardness of computation.

3) Linearity testing. Linearity testing is closely related to the Hadamard code. We described Blum-Luby-Rubinfeld linearity test and analyzed it. This is very similar to the Fourier theoretic formula and argument we saw last time for the cap problem.

We start to describe Delsartes linear Programming method to be continued next week.

# Analysis of Boolean Functions week 5 and 6

## First passage percolation

1)  Models of percolation.

We talked about percolation introduced by Broadbent and Hammersley in 1957. The basic model is a model of random subgraphs of a grid in n-dimensional space. (Other graphs were considered later as well.) Here, a grid is a graph whose vertices have integers coordinates and where two vertices are adjacent if their Euclidean distance is one. Every edge of the grid-graph is taken (or is “open” in the percolation jargon) with the same probability p, independently. We mentioned some basic questions – is there an infinite component? How many infinite components are there? What is the probability that the origin belongs to such an infinite component as a function of p?

I mentioned two results: The first  is Kesten’s celebrated result that the critical probability for planar percolation is 1/2. The other by Burton and Keane is that in very general situations almost surely there is a unique infinite component or none at all. This was a good point to mention a famous conjecture- The dying percolation conjecture (especially in dimension 3) which asserts that at the critical probability there is no infinite component.

We will come back to this basic model of percolation later in the course, but for now we moved to a related more recent model.

2) First passage percolation

We talked about first passage percolation introduced by Hammersley and Welsh in 1965. Again we consider the infinite graph of a grid and this time we let the length of every edge be 1 with probability 1/2 and 2 with probability 1/2 (independently). These weights describe a random metric on this infinite graph that we wish to understand. We consider two vertices (0,0) and (v,0) (for high dimension the second entry can account for a (d-1) dimensional vectors, but we can restrict our attention to d=2) and we let D(x) be the distance between these two vectors. We explained how D is an integer values function on a discrete cube with Liphshitz constant 1. The question we want to address is : What is the variance of D?

Why do we study the variance, when we do not know exactly the expectation, you may ask? (I remember Lerry Shepp asking this when I talked about it at Bell Labs in the early 90s.) One answer is that we know that the expectation of D is linear, and for the variance we do not know how it behaves. Second, we expect that telling the expectation precisely will depend on the model while the way the variance grows and perhaps D‘s limiting distribution, will be universal (say, for dimension 2). And third, we do not give up on the expectation as well.

Here is what we showed:

1) From the inequality $var(D)=\sum_{S\ne \emptyset}\hat D^2(S)\le\sum \hat D^2(S)|S|$ we derived Kesten’s bound var (D) =O(v).

2) We considered the value s so that $\mu(D>s)=t$, and showed by the basic inequality above that the variance of D conditioned on D>s is also bounded by v. This corresponds to exponential tail estimate proved by Kesten.

3) Using hypercontractivity we showed that the variance of D conditioned on D>s is actually bounded above by v/log (1/t) which corresponds to Talagrand’s sub-Gaussian tail-estimate.

4) Almost finally based on a certain very plausible lemma we used hypercontructivity to show that most Fourier coefficients of D are above the log v level, improving the variance upper bound to O(v/log v).

5) Since the plausible lemma is still open (see this MO question) we showed how we can “shortcut” the lemma and prove the upper bound without it.

### The major open question

It is an open question to give an upper bound of $v^{1-\epsilon}$ or even $v^{2/3}$ which is the expected answer in dimension two. Michel Ledoux wisely proposes to prove it just for directed percolation in the plane (where all edges are directed up and right) from (0,0) to (v,v) where the edge length is Gaussian or Bernoulli.

### Lecture 8

Three Further Applications of Discrete Fourier Analysis (without hypercontractivity)

The three next topics will use Fourier but not hypercontractivity. We start by talking about them.

1) The cap-set problem, some perspective and a little more extremal combinatorics

We talked about Roth theorem, the density Hales Jewett theorem,  the Erdos-Rado delta-system theorem and conjecture. We mentioned linearity testing.

2) Upper bounds for error-correcting codes

This was a good place to mention (and easily prove) a fundamental property used in both these cases:  The Fourier transform of convolutions of two functions f and g is the product of the Fourier transform of f and of g.

3) Social choice and Arrow’s theorem

The Fourier theoretic proof for Arrow’s theorem uses only Parseval’s formula so we are going to start with that.

## Fourier-theoretic proof of Arrows theorem and related results.

We talked a little about Condorcet(we will later give a more detailed introduction to social choice). We mentioned Condorcet’s paradox, Condorcet’s Jury Theorem, and the notion of Condorcet winner.

Next we formulated Arrow’s theorem.  Lecture 9 was devoted to a Fourier-theoretic proof of Arrow theorem (in the balanced case). You can find it discussed in this blog post by Noam Nisan.  Lecture 10 mentioned a few further application of the Fourier method related to Arrow’s theorem, as well as a simple combinatorial proof of Arrow’s theorem in full generality. For the Fourier proof of Arrow’s theorem we showed that a Boolean function with all its non-zero Fourier coefficients on levels 0 and 1 is constant, dictatorship or anti-dictatorship. This time we formulated FKN theorem and showed how it implies a stability version of Arrow’s theorem in the neutral case.

# Analysis of Boolean Functions – week 4

### Lecture 6

Last week we discussed two applications of the Fourier-Walsh plus hypercontractivity method and in this lecture we will discuss one additional application:

The lecture was based on a 5-pages paper by Ehud Friedgut and Jeff Kahn: On the number of copies of one hypergraph in another  Israel Journal of Mathematics Vol. 105 (1998) pp. 251-256.

In this application our method  has nice but not optimal consequences, and another method – applying Shearer’s inequality, gives the optimal result.

1) The question: Given a hypegraph H with k vertices what is the maximum number of labeled copies of H inside a hypegraph G with $\ell$ edges.

There are cases where the answer is known with great precision. The Kruskal-Katona theorem gives the answer when H is the hypegraph whose edges are all the r-subsets of a vertex set V of size k.

In our study  we will fix H and care only about the asymptotic behavior as a function of $\ell$. We have a simple upper bound of $\ell^k$ and the question is to identify the correct exponent.

2) Stable sets and fractional stable sets. A stable set S in a hypergraph H (also called independent set) is a set of vertices so that every edge contains at most one vertex from S. The stable number $\alpha(H)$ of a hypergraph H is the maximum size of a stable set.

Now comes an idea which is very important in graph theory, of considering the linear programming relaxation of combinatorial parameters. A fractional stable set is an assignment of non negative weights to the vertices, so the sum of weights in every edge is at most one. So this is a “fuzzy set” of a kind the membership of a vertex to a set is described by a number in the interval [0,1] rather than the set {0,1}. The size of a fractional stable set is the sum of weights of all vertices. The fractional stable number $\alpha^*(H)$ is the minimum size of a fractional stable set.

3) Cover and fractional cover numbers

We next described covering and fractional covering (of vertices by edges) in hypergraphs, the covering number $\rho(H)$ of a hypergraph H and the fractional covering number $\rho^*(H)$. Linear programming duality gives that the fractional stable number  is equal to the fractional covering number.

Friedgut-Kahn theorem: The number of copies of H is a hypergraph G with $\ell$ edges is $O(\ell^{\rho^*(H)})$ and this bound is sharp.

The case of graphs was proved (in a different language) by Noga Alon in his M. Sc. thesis, and Noga’s first publication  On the number of subgraphs of prescribed type pf graphs with a given number of edges,( Israel J. Math. 38(1981), 116-130).) Part of the challenge was to find the right extension for Alon’s theorem.

5) The lower bound.

Next we explained the nice and simple construction giving the lower bound, which is based on the weights realizing the fractional stable number of H.

6) Bonami’s inequality in a dual form

Our next thing was to state a consequence of Bonami’s hypercontractive inequality, which is a direct extension of Chinchine inequality. Then we showed a weaker upper bound than the actual theorem based on the Bonami inequality .

It is an interesting open question to apply harmonic analysis to the general case. (I believe it is tractable.)

7) Traces and Shearer’s lemma

Next we defined the trace of a hypergraph on a subset W of vertex-set V and stated Shearer’s lemma.

8) More about traces and a little more extremal combinatorics

Not having enough time to complete the proof of Friedgut-Kahn theorem using Shearer’s lemma, we proved the fundamental Sauer-Shelah inequality (see this post), and stated Frankl’s conjecture (see this post, and this one  (sec 3d) ).

### Lecture 7

We started with the proof of the Friedgut-Kahn bound using Shearer’s lemma. Then we explained the simple connection between influences and traces and mentioned the connection of Shearer’s lemma (and the Loomis-Whitney theorem) to edge-sioperimetry.

Our next application: First Passage percolation. We gave a short introduction to models of percolation and started to discuss our fourth application of the Fourier+hypercontractivity method: An upper bound for the variance of first passage percolation. Here the method gives the best known result, but unlike KKL’s theorem the result is not sharp. We are third way toward a proof so I may write about it next time. The discussion of first passage percolation is based on the paper First Passage Percolation Has Sublinear Distance Variance by Benjamini, Kalai and Schramm.

# Analysis of Boolean Functions – Week 3

### Lecture 4

In the third week we moved directly to the course’s “punchline” – the use of Fourier-Walsh expansion of Boolean functions and the use of Hypercontractivity.

Before that we  started with  a very nice discrete isoperimetric question on a graph which is very much related to the graph of the discrete cube.

Consider the graph G whose vertices are 0-1 vectors of length n with precisely r ‘1’s, and with edges corresponding to vertices of Hamming distance two. (Which is the minimal Hamming distance between distinct vertices.) Given a set A of m vertices, how small can $E(A, \bar A)$ be? (We already talked about intersecting families of sets of constant size – the Erdos-Ko-Rado theorem and, in general, it is a nice challenge to extend some of the ideas/methods of the course to constant weight situation.)

And now for the main part of the lecture.

1) Basics harmonic analysis on the discrete cube. We considered the vector space of real functions on the discrete cube $\Omega_n$ and defined an inner product structure. We also defined the p-th norm for $1\le p\le \infty$. Next we defined the Fourier-Walsh functions and showed that they form a orthonormal basis. This now leads to the Fourier-Walsh expansion for an arbitrary real function f on the discrete cube $f=\sum_{S\subset[n]}\hat f(S)W_S$, and we could easily verify Parseval formula.

2) Influence and Fourier. If f is a real function on $\Omega_n$ and $f=\sum \hat f(S)W_S$ its Fourier-Walsh expansion. We showed that $I_k(f)=\sum_{S:k \in S}\hat f^2(S)$. It follows that $I(f)=\sum_S\hat f^2(S)|S|$. The Fourier-theoretic proof for I(f) ≥ 4 t (1-t) where t=μ(f) now follows easily.

3) Chinchine, hypercontractivity and the discrete isoperimetric inequality.

Next we discussed what will it take to prove the better estimate I(f) ≥ K t log t. We stated Chinchine inequality, and explained why is Chinchine inequality relevant: For Boolean functions the pth power of the p-norm does not depend on p. (It always equals t.) Therefore if t is small, the p-th norms themselves much be well apart! After spending a few moments on the history of the inequality (as told to me by Ron Blei) we discussed what kind of extension do we need, and stated  the Bonami-Gross-Beckner inequality. We use Bonami’s inequality to proof of the inequality I(f) ≥ K t log t and briefly talked about what more does it give.

### Lecture 5

1) Review and examples. We reviewed what we did in the previous lecture and considered a few examples: Dictatiorship; the AND function (an opportunity to mention the uncertainty principle,) and MAJORITY on three variables. We also mentioned another connection between influences and Fourier-Walsh coefficients: for a monotone (non decreasing) Boolean function f, $I_k(f) = -2\hat f(\{k\})$.

2) KKL’s theorem

KKL’s theorem: There is an absolute constant K so that for every Boolean function f, with t=μ(f), there exists k, 1 ≤ k ≤ n such that

$I_k(f) \ge K t (1-t) logn/n.$

To prove of KKL’s theorem: we repeat, to a large extent, the steps from Lecture 4 (of course, the proof of KKL’s theorem was where this line of argument came from.) We showed that if all individual influences are below $1/sqrt n$ than $I(f) \ge K t(1-t) \log n$.

We mentioned one corollary: For Boolean function invariant under a transitive group of permutations, all individual influences are equal and therefore $I(f) \ge K t (1-t)\log n$.

3) Further problems

In the  last part of the lecture we mentioned seven problems regarding influence of variables and KKL’s theorem (and I added two here):

1) What can be said about balanced Boolean functions with small total influence?

2) What can be said about Boolean functions for which I(f) ≤ K t log (1/t), for some constant K, where t=μ(f)?

3) What can be said about the connection between the symmetry group and the minimum total influence?

4) What can be said about Boolean functions (1/3 ≤ μ(f)≤ 2/3, say) for which $\max I_k(f) \le K log n/n$.?

5) What more can be said about the vector of influences $(I_1(f),I_2(f), \dots I_n(f))$?

6)* What is the sharp constant in KKL’s theorem?

7)* What about edge expansions of (small) sets of vertices in general graphs?

8) Under what conditions $I(f) \ge n^\beta$ for β >0.

9) What about influence of larger sets? In particular, what is the smallest t (as a function of n ) such that if $\mu(f)=t$ there is a set S of variables S ≤ 0.3n with $I_S(f) \ge 0.9$?

(This post  is a short version, I will add details later on.)

# Analysis of Boolean functions – week 2

### Post on week 1; home page of the course analysis of Boolean functions

Lecture II:

We discussed two important examples that were introduced by Ben-Or and Linial: Recursive majority and  tribes.

Recursive majority (RM): $F_m$ is a Boolean function with $3^m$ variables and $F_{m+1} (x,y,z) = F_1(F_m(x),F_m(y),F_m(z))$. For the base case we use the majority function $F_1(x,y,z)=MAJ(x,y,z)$.

Tribes: Divide your n variables into s pairwise disjoint sets (“tribes”) of cardinality t. f=1 if for some tribe all variables equal one, and thus T=0 if for every tribe there is a variable with value ‘0’.

We note that this is not an odd function i.e. it is not symmetric with respect to switching ‘0’ and ‘1’. To have $\mu=1/2$ we need to set $t=\log_2n - \log_2\log_2n+c$. We computed the influence of every variable to be C log n/n. The tribe function is a depth-two formula of linear size and we briefly discussed what are Boolean formulas and Boolean circuits (These notions can be found in many places and also in this post.).

I states several conjectures and questions that Ben-Or and Linial raised in their 85 paper:

Conjecture 1: For every balanced Boolean function with n variables there is a variable k whose influence is $\Omega (\log n/n)$.

Conjecture 2: For every balanced Boolean function with n variables there is a set S of n/log n variables whose influence $I_S(f)$ is 1-o(1).

Question 3: To what extent can the bound in Conjecture 2 be improved if the function f is odd. (Namely, $f(1-x_1,1-x_2,\dots, 1-x_n)=1-f(x_1,x_2,\dots, x_n)$.)

Our next theme was discrete isoperimetric results.  I noted the connection between total influence and edge expansion and proved the basic isoperimetric inequality: If μ(f)=t then I(f) ≥ 4 t(1-t). The proof uses the canonical paths argument.

### Lecture III:

We proved using “compression” that sharp bound on I(f) as a function of t=μ(f). We made the analogy between compression and Steiner symmetrization – a classic method for proving the classical isoperimetric theorem. We discussed similar results on vertex boundary and on Talagrand-Margulis boundary (to be elaborated later in the course).

Then We proved the Harris-Kleitman inequality and showed how to deduce the fact that intersecting family of subsets of [n] with the property that the family of complements is also intersecting has at most $2^{n-2}$ sets.

The next topic is spectral graph theory. We proved the Hoffman bound for the largest size of an independent set in a graph G.

I mentioned graph-Laplacians and the spectral bound for expansions (Alon-Milman, Tanner)..

The proofs mentioned above are so lovely that I will add them on this page, but sometime later.

Next week I will introduce harmonic analysis on the discrete cube and give a Fourier-theoretic explanation for  the additional log (1/t) factor in the edge isoperimetric inequality.

Important announcement: Real analysis boot camp in the Simons Institute for the Theory of Computing, is part of the program in Real Analysis and Computer Science. It is taking place next week on September 9-13 and has three lecture series. All lecture series are related to the topic of the course and especially:

# Analysis of Boolean Functions – week 1

In the first lecture I defined the discrete n-dimensional cube and  Boolean functions. Then I moved to discuss five problems in extremal combinatorics dealing with intersecting families of sets.

1) The largest possible intersecting family of subsets of [n];

2) The largest possible intersecting family of subsets of [n] so that the family of complements is also intersecting;

3) The largest possible family of graphs on v vertices such that each two graphs in the family contains a common triangle;

4) Chvatal’s conjecture regarding the maximum size of an intersecting family of sets contained in an ideal of sets;

Exercise: Prove one of the following

a) The Harris-Kleitman’s inequality

b) (from the H-K inequality) Every family of subsets of [n] with the property that every two sets have non-empty intersection and no full union contains at most $2^{n-2}$ sets.

More reading: this post :”Extremal combinatorics I: extremal problems on set systems“. Spoiler: The formulation of Chvatal’s conjecture but also the answer to the second exercise can be found on this post: Extremal combinatorics III: some basic theorems. (See also peoblen 25 in the 1972 paper Selected combinatorial research problems by Chvatal, Klarner and Knuth.)

I moved to discuss the problem of collective coin flipping and the notion of influence as defined by Ben-Or and Linial. I mentioned the Baton-passing protocol, the Alon-Naor result, and Feige’s two-rooms protocol.

More reading: this post :” Nati’s influence“. The original paper of Ben-Or Linial:  Collective coin flipping, M. Ben-Or  and N. Linial, in “Randomness and    Computation” (S. Micali ed.) Academic Press, New York, 1989, pp.    91-115.

# High Dimensional Expanders: Introduction I

Alex Lubotzky and I  are running together a year long course at HU on High Dimensional Expanders. High dimensional expanders are simplical (and more general) cell complexes which generalize expander graphs. The course is taking place in Room 110 of the mathematics building on Tuesdays 10-12. The first four hours were devoted to an introduction to the course that came with a disclaimer that some of the material is new to us as well, with a promise that the course will occasionally turn into a seminar featuring interesting speakers, and with a hope that perhaps we will be able to discover while running the course interesting questions to answer or ask. It will be too difficult for me  to follow the entire course over the blog but I would like to devote two posts to the Introduction with some remarks and backgrounds.

## A brief introduction.

The introduction included the following four  parts.

Introduction to Expanders: Alex briefly described the definition of expanders, their spectral properties, the relation with random walks, the construction of expanders, the construction of Ramanujan graphs, one brief application.

Introduction to high dimensional complexes: I briefly described higher dimensional topological objects and mainly simplicial complexes, and the notions of homology and cohomology.

Introduction to high dimensional expanders: I briefly went over several possible (not equivalent) definitions: Cohomological definition; geometric and topological definitions; spectral definition;

Ramanujan graphs and complexes: Alex briefly described what they are.

In this post we will go over the first three items.  The next post will discuss the fourth item, provide credits, references and links,  and also discuss examples, connections, potential applications, and questions.

## Introduction

### 1.1 The definition

Let $G$ be a graph on the vertex set $V$. $G$ is an $\epsilon$expander if for every set $U$ of vertices, $|U| \le |V|/2$, the number of edges of $G$ containing one vertex from $U$ and one vertex from $V \backslash U$ is at least $\epsilon|U|$.

The set of edges $E_G(U,\bar U)$ in $G$ from $U$ to its complement are sometimes called the edge-boundary of $U$. The expansion $h(G)$ of $G$ is defined by

$h(G)= \min\{E_G(U,\bar U): U \subset V, U \ne \emptyset, |U| \le |V|/2\}$.

### 1.2 Spectral relation

We will mainly be interested in the case where $G$ is a $k$-regular graph with $n$ vertices and $k$ is a fixed integer. Let $A$ be the adjecency matrix of $G$. The largest  eigenvalue of $A$ is $k$ and let $\lambda=\lambda_2$ be the second largest eigenvalue.

Theorem (Alon-Boppana): $\lambda\ge 2\sqrt {k-1}-o_n(1)$

Theorem (Tanner, Alon-Milman): $h(G)\ge (k-\lambda)/2$

Theorem (Alon): $h(G) \le \sqrt {2k(k-\lambda)}$

### 1.3 Random walks

For regular graphs, the expansion property (through the spectral interpretation) implies that (unless the graph is bipartite) the simple random walk on the graph $G$ converges quickly to the uniform distribution on the vertices.

We will come back to examples of expander graph and to a certain application after defining high dimensional expansion.

### 2. High dimensional complexes and homology

We defined abstract simplicial complexes and geometric simplicial complexes $K$. Then we defined the chain groups $C_i(K): i\ge -1$ with coefficients in $Z/2Z$ and the cochain groups $C^i(K)$ and we identified between them. Thus an element $x$ in the $i$th chain or cochain groups can be regarded as a linear combination of $i$-dimensional faces of $K$. The support of $x$, $supp(x)$ is the set of $i$-faces of $K$ with nonzero coefficient in $x$.

We described the boundary and coboundary operations.  Briefly, if $x$ is the $i$ chain that correspond to an $i$-face $S$, then $\partial (x)$, the boundary operation,  is the sum of all $(i-1)$ faces of $S$. This definition is extended by linearity to every $i$-chain. When we think about $x$ above as an element in the $i$th cochain group then $\delta(x)$ is the sum (modulo 2) of all $(i+1)$-faces of $K$ containing $S$.

Then we defined the vector spaces – $Z^i(K)$ of $i$-cocycles, $B^i(K)$ of $i$-coboundaries and the $i$th cohomology $H^i(K)=Z^i(K)/B^i(K)$. We also defined boundaries, cycles and homology.

Everything we said apply to homology with other fields of coefficients except that it was easier to define the boundary/coboundary operations over Z/2Z (for other fields of coefficients we need to worry about signs). We always consider reduced homology/cohomology without saying it.

### 3. High dimensional expanders

We mention several different (nonequivalent) notions of high dimensional expansions.

### 3.1 (co)Homological definition:

Let $K$ be a simplicial complex, define

$h_i(K)=\min_y \max _x~\{~{\frac{|supp(y)|}{|supp(x)|}}:$ $y \in B^{i+1}(K),~x\in (C^i(K)\backslash B^i(K)), \delta (x)=y\}$

Another way to write the same thing is this:

For $x\in C^i(K)\backslash B^i(K)$ write

$h(x)=\min\{|supp(\delta x)|/(|supp(x+\delta z)|: z \in C^{i-1}(K)\}.$

Then $h_i(K)=\min\{~h(x): x\in C^i(K)\backslash B^i(K)\}$.

If $K$ is a $d$-dimensional complex we will denote $h(K)=h_{d-1}(K)$.

Some remarks:

1) For graphs: When $K$ is a 1-dimensional complex, the cohomological definition of $h_0(K)$ coincides with the combinatorial definition. Every $x \in C^{0}(K)$ is the sum of vertices in a set $U \subset V$ of vertices. $\delta (x)$ is the sum of edges in $E(U,\bar U)$. The formula reduces to $h(G)=\min |E(U,\bar U)/(\min |U|, |V|-|U|)$ over all nonempty proper subsets $U$ of $V$.

2) $h_i(K)=0$ if and only if $H_i(K)\ne 0$.

### 3.2 Spectral definition

In this little sections we move to chain and cochain groups with real coefficients. Define the Laplacian $L_i(K): C^i(K)\to C^i(K)$ by

$L_i = \delta \partial +\partial\delta$.

Let $\lambda^i(K)$ to be the minimal eigenvalue of $L_i(K)$.

We will say that a $d$-dimensional space is an $i$-expander in the spectral sense, if $\lambda^{i}(K)$ is large. As before we are mainly interested in the case where $i=\dim (K)-1$

### 3.3 Geometric and topological definitions.

Let $K$ be a $d$-dimensional simplicial complex. Let $\phi$ be an embedding of the vertices of $K$ to $R^d$. Given a $d$-dimensional face $F$ of $K$ we denote by $\phi(F)$ the convex hull of the images under $\phi$ of the vertices of $F$.  (Alternatively we can think about $K$ as a geometric simplicial complex and about $\phi$ as a mao to $R^d$ which is an affine map on the faces of $K$.)

The overlap number w.r.t. $\phi$ and a point $x \in R^d$, denoted by $overlap(K,\phi,x)$ is the number of $d$ dimensional faces in $K$ whose image under $\phi$ contains $x$. The overlap number of $K$ is defined as

$overlap(K) = \min_{\phi} \max_{x \in R^d} overlap (K,\phi,x)$.

The topological overlap number of $K$ is defined the same except that you allow $\phi$ to be a continuous function from $K$ (regarded as a geometric simplicial complex) into $R^d$.

Note that for graphs, large expansion implies large overlap number.

Next post: Examples, relations between the various definitions, basic research questions. Ramanujan graphs and complexes. Applications: error correcting codes, qantum codes.  More remarks, credits and and links to relevant papers.

# In how many ways you can chose a committee of three students from a class of ten students?

The renewed interest in this old post, reminded me of a more recent event:

Question: In how many ways you can chose a committee of three students from a class of ten students?

My expected answer: ${10} \choose {3}$ which is 120.

Alternative answer 1:(Lior)  There are various ways: you can use majority vote, you can use dictatorship (e.g. the teacher chooses); approval voting, Borda rule…

Alternative answer 2: There are precisely four ways: with repetitions where order does not matter; with repetitions where order matters; without repetitions where order matters; without repetitions where order does not matter,

Alternative answer 3: The number is truly huge. First we need to understand in how many ways we can choose the class of ten students to start with. Should we consider the entire world population? or just the set of all students in the world, or something more delicate? Once we choose the class of ten students we are left with the problem of chosing three among them.

# Test Your Intuition (11): Is it Rational to Insure a Toaster

Here is a question from last year’s exam in the course “Basic Ideas of Mathematics”:

You buy a toaster for 200 NIS ($50) and you are offered one year of insurance for 24 NIS ($6).

a) Is it worth it if the probability that damage covered by the insurance will occur during the first year is 10%? (We assume that without insurance, such damage makes the toaster a “total loss”.)

b) Is it worth it if the probability that the toaster will be damaged is unknown?

As an additional test of your imagination, can you think of reasons why buying the toaster insurance would be rational?

# The Beauty of Mathematics

This semester I am teaching an introductory course in mathematics for students in other departments.  I taught a similar course last year entitled “Basic Ideas in Mathematics,” and this year, following a suggestion of my wife, I changed the name to “The Beauty of Mathematics”. Another change is that starting this year we have a general program in the university, called “Cornerstones“, (initiated by the rector,) whose purpose is to widen the education we offer to our students, and this course is part of the new program.

Talking about beauty rather than about basic ideas, combined with the new Cornerstone program have led many  more students to enlist to the course this year, and subsequently the lectures will use computer presentations.

Of course, the challenge has become  harder. I truly think mathematics is beautiful, but trying to convey its beautiful facets has never been easy. Also, I do not want to sweep under the rug the difficulty of mathematics, and the students will have to learn some basic mathematical skills and some abstract mathematics.   Ideas and suggestions are most welcome.

What do you regard as a great example of the beauty of mathematics?

There will be  is a separate page devoted to the course and the slides for the first lecture (in Hebrew) are linked there (and here). The first lecture was devoted to “numbers.”