# Extremal Combinatorics III: Some Basic Theorems

.

Shattering

Let us return to extremal problems for families of sets and describe several basic theorems and basic open problems. In the next part we will discuss a nice proof technique called “shifting” or “compression.”

## The Sauer-Shelah (-Perles -Vapnik-Chervonenkis) Lemma:

(Here we write $N = \{1,2,\dots,n\}$.) A family ${\cal F} \subset 2^N$ shatters a set $S \subset N$ if for every $R \subset S$ there is $F \in {\cal F}$ such that $S \cap F=R$. Note that in order to shatter a set of size $m$  you need at least $2^m$ sets. But how many sets will guarantee that you shatter a set with $m$ elements? The Sauer-Shelah Lemma answers this question!

Theorem (Sauer-Shelah): If  $\cal F > {n \choose 0} + {n \choose 1} + \dots + {n \choose k}$  then there exists a set $S$, $|S|=k+1$ such that $\cal F$ shatters $S$.

Hmm LaTeX can go wild, let me try that again:

Theorem (Sauer-Shelah): If  $|{\cal F}| > {n \choose 0} + {n \choose 1} + \dots + {n \choose k}$ then there exists a set $S$, $|S|=k+1$ such that $\cal F$ shatters $S$.

(This is much better.)

The simplest way to prove this theorem is by induction. Here is a variation I just heard from Noga. Proof (a little sketchy): Prove a slightly stronger fact that every family $\cal F$ shatters at least as many sets as $|{\cal F}|$. Consider the subfamily ${\cal F}_0$ of sets in the family not containing $'1'$. By induction ${\cal F}_0$ shatters at least as many subsets of $N' = \{2,3, \dots , n\}$ as $|{\cal F}_0|$. Next consider ${\cal F}_1 =$ $\{S \in {\cal F}: 1 \in S\}$, and ${\cal F}'_1$ = $\{S \backslash \{1\}:$ $S \in {\cal F}, 1 \in S \}$.  By induction ${\cal F}'_1$ shatters as many subsets of $\{2,3,\dots,n\}$ as its cardinality. The number of subsets of $N'$ shattered by ${\cal F}_0$ and ${\cal F}'_1$ sum up to at least $|{\cal F}_0|$+$|{\cal F}'_1|$ = $|{\cal F}|$, and every subset of $N'$ shattered by ${\cal F}'_1$ is shattered by ${\cal F}_1 \subset {\cal F}$. Sababa? not quite! there may be subsets $R \subset N'$ shattered by both ${\cal F}_0$ and ${\cal F}'_1$. But note that in this case both $R$ and latex $R \cup \{1\}$ are shattered by $\cal F$. walla!

OK, let me give a few more words of explanation about the proof. When I say “every family” do I mean “every family that satisfies the condition of the theorem?” (And If I do why can I use induction?) No, i really mean every family. The point is that if every family shatters  at least as many sets as its size then a family which satisfies  $|{\cal F}| > {n \choose 0} + {n \choose 1} + \dots + {n \choose k}$ shatters more sets than this sum. But then it must shatter a set of size larger than $k$ because there are not enough subsets of the set $\{1,2,\dots,n\}$ of size at most $k$.

One additional word about the proof. Considering the sets containing ‘1’ and those not containing ‘1’ and then applying an inductive argument which can be simple or extremely complicated, direct or involving several additional methods, is a very basic proof method in extremal set theory.

This theorem can be described as an “eigentheorem” because it is important in many different places. It was mentioned (with an algebraic proof by Frankl and Pach) in Gowers’ blog and also, in another context, in Kowalski’s blog. Sauer proved it in response to a problem of Erdos. Shelah (with Perles) proved it as a useful lemma for Shelah’s theory of stable models. (At some later time, Benjy Weiss asked Perles about such a result in the context of ergodic theory and Perles who forgot that he proved it once proved it again.) Vapnik and Chervonenkis proved it in the context of statistical learning theory. The VC-dimension of a family of sets is the size of the largest set the family shatters. This notion plays an important role in learning theory, statistics and probability theory. There are several applications of the Sauer-Shelah theorem to analysis and to the geometry of Banach spaces, and there are several extensions and refinements.

## The Kruskal-Katona Theorem

We let $\bf N$ denote the set of positive integers and let ${{\bf N} \choose {k}}$  = $\{S: S \subset {\bf N}, |S|=k\}$.

Let ${\cal F}$ be a subset of ${{\bf N} \choose {k}}$. The Shadow $\partial {\cal F}$ is the set of all (k-1)-subsets of $\bf N$  which are contained in at least one set in $\cal F$.

We would like to answer the following question. How small can the Shadow of a family of m k-sets be? The precise answer to this question is given by the Kruskal-Katona’s theorem.

# New Haven (mainly pictures)

Yale, New Haven

I am back in New Haven which have become my home away from home in the last five years.

Cappuccino’S and more – Cedar cross Congress, New Haven. Not only that this name is similar to my blog’s name, but throwing in as many s’s and apostrphees reflects also my approach to the English grammer.

# Annotating Kimmo Eriksson’s Poem

“Start counting her NUMBER OF FACES,” Kimmo Eriksson, Brush up your Björner (2008).

The time is right to annotate Kimmo Eriksson’s memorable poem:

## 1. What are Chip firing games?

Many women will find it admirable
if you tell her she makes your CHIPS FIRABLE.

Chip firing games are (solitary) games played on graphs: each node of a directed graph contains a pile of chips. A move consists of selecting a node with at least as many chips as its outdegree, and sending one chip along each outgoing edge to its neighbors. The remarkable property of these games is that depending on the sizes of the piles, either the game continues forever or it reaches a position where it cannot be played further. In the latter case the final position, the number of moves, and even the number of times each node fired do not depend on the specific moves made along the game.

Sheep firing game (You may play the game by clicking on the picture; disclaimer: we object to cruel treatment of animals)

In the rest of this post you can read about shellability, weak and strong (Bruhat) partial orders on the set of permutations, chessboard complexes, and more.  Continue reading

# A Diameter Problem (5)

### 6. First subexponential bounds.

Proposition 1: $F(d,n) \le F_k(d,n) \times F(d-k,n-k).$

How to prove it: This is easy to prove: Given two sets $S$ and $T$ in our family $\cal F$, we first find a path of the form $S=S_0, S_1, S_2, \dots, S_t = T$ where, $|S_{i-1} \cap S_i| \ge k$ and $t \le F_k(d,n)$. We let $B_i \subset (S_{i-1} \cap S_i )$ with $|B_i|=k$ and consider the family ${\cal F}'[B_i]$. This is a family of $(d-k)$-subsets of an $(n-k)$ set ($N \backslash B_i$) . It follows that we can have a path from $S_{i-1}$ to $S_i$ in  $G({\cal F}[B_i])$ of length at most $F(d-k,n-k)$. Putting all these paths together gives us the required result. (We remind the notations at the end of this post.)

How to use it: It is not obvious how to use Proposition 1. Barnette’s argument from part 3 was about $k=1$, and it used something a bit more sophisticated. Applying Proposition 1 directly for $k=1$ does not give anything non trivial. However, when $n$ is small compared to $d$ and $k$ is a small fraction of $d$ we can say something.

Let us start with an example: $n = 3d$. let us choose $k= d/4$. Consider a path $S=S_0,S_1,\dots S_t=T$ in $G({\cal F})$ from two sets $S$ and $T$. Suppose also that in this path

(*) $S_i \cap T \subset S_{i+1}\cap T$, for every $i$

Let $U_1$ be the last set in the path whose intersection with $S$ has at least $d/4$ elements.  Let $U_2$ be the last set in the path whose intersection with $U_1$ has at least $d/4$ elements. I claim that $S, U_1, U_2, T$ is a path in $G_{d/4}({\cal F})$.

# Diameter Problem (4)

Let us consider another strategy to deal with our diameter problem. Let us try to associate other graphs to our family of sets.

Recall that we consider a family $\cal F$ of subsets of size $d$ of the set $N= \{ 1,2,\dots, n \}$.

Let us now associate  more general graphs to $\cal F$ as follows: For an integer $k$ $1 \le k \le d$ define $G_k({\cal F})$ as follows: The vertices of  $G_k({\cal F})$ are simply the sets in $\cal F$. Two vertices $S$ and $T$ are adjacent if $|S \cap T| \ge k$. Our original problem dealt with the case $k=d-1$.  Thus, $G({\cal F}) = G_{d-1}({\cal F})$. Barnette proof presented in the previous part refers to $G_1({\cal F})$ and to paths in this graph.

As before for a subset $A \subset N$ let ${\cal F}[A]$ denote the subfamily of all subsets of $\cal F$ which contain $A$. Of course, the smaller $k$ is the more edges you have in $G_k({\cal F})$. It is easy to see that assuming that $G_1({\cal F[A]})$ is connected for every $A$ for which ${\cal F}[A]$ is not empty already implies our condition that $G({\cal F[A]})$ is connected for every $A$ for which ${\cal F}[A]$ is not empty.

Let $F_k(d,n)$ be the maximum diameter of $G_k({\cal F})$  in terms of $k,d$ and $n$, for all families $\cal F$ of $d$-subsets  of $N$ satisfying our connectivity relations.

Here is a simple claim:

$F(d,n) \le F_k(d,n) \times F(d-k,n-k).$

Can you prove it? Can you use it?

# Diameter Problem (3)

### 3. What we will do in this post and and in future posts

We will now try all sorts of ideas to give good upper bounds for the abstract diameter problem that we described. As we explained, such bounds apply to the diameter of graphs of simple d-polytopes.

All the methods I am aware of for providing upper bounds are fairly simple.

(1) You think about a strategy from moving from one set to another,

(2) You use this strategy to get a recursive bound,

(3) You solve the recursion and hope for the best.

What I would like you to think about, along with reading these posts, is the following questions:

(a) Can I come up with a different/better strategy for moving from one set to the other?

(b) Can I think about a mathematically more sophisticated way to get an upper bound for the diameter?

(c) Can this process of finding a strategy/writing the associated recurrence/solving the recurrence be automatized? The type of proofs we will describe are very simple and this looks like a nice example for a “quasi-automatic” proof process.

Let me repeat the problem and prove to you a nice upper bound:

### Reminder: Our Diameter problem for families of sets

Consider a family $\cal F$ of subsets of size d of the set N={1,2,…,n}.

Associate to $\cal F$ a graph $G({\cal F})$ as follows: The vertices of  $G({\cal F})$ are simply the sets in $\cal F$. Two vertices $S$ and $T$ are adjacent if $|S \cap T|=d-1$.

# Oded

I just heard the terrible news that Oded Schramm was killed in a hiking accident. Oded was hiking on Guye Peak near Snoqualmie Pass near Seattle. This is a terrible loss to Oded’s family, and our hearts and thoughts are with his wife Avivit and children Tselil and Pele. This is a great personal loss to me, to his many friends, and to mathematics.