## A Diameter problem (7): The Best Known Bound

### Our Diameter problem for families of sets

Consider a family $\cal F$ of subsets of size d of the set N={1,2,…,n}.

Associate to $\cal F$ a graph $G({\cal F})$ as follows: The vertices of  $G({\cal F})$ are simply the sets in $\cal F$. Two vertices $S$ and $T$ are adjacent if $|S \cap T|=d-1$.

For a subset $A \subset N$ let ${\cal F}[A]$ denote the subfamily of all subsets of $\cal F$ which contain $A$

MAIN ASSUMPTION: Suppose that for every $A$ for which ${\cal F}[A]$ is not empty $G({\cal F}[A])$ is connected.

We will call a family satisfying this assumption “hereditarily connected”.

MAIN QUESTION:   How large can the diameter of $G({\cal F})$ be in terms of $d$ and $n$?

We denote the answer by $F(d,n)$.

For $v \in \{1,2,\dots,n\}$ let ${\cal F'}[v]$ be the family obtained from ${\cal F}[\{v\}]$ by removing $v$  from every set. Since $G({\cal F}[v]) = G({\cal F}' [v])$, the diameter of  $G({\cal F}[\{v\}])$ is at most $F(d-1,n-1)$.

### 8. A slight generalization

Let ${\cal F}$ be an hereditarily connected family of $d$-subsets of a set $X$. Let $Y$ be a subset of $X$. The length of a path of sets $S_1,S_2,\dots, S_t$ modulo Y  (where $|S_i \cap S_{i+1}|=d-1$ for every $i$) is the number of $j, 1 \le j such that both $S_j$ and $S_{j+1}$ are subsets of $Y$. (In other words, in $G({\cal F})$ we consider edges between subsets of Y as having length 1 and other edges as having length 0.)

Let $T(d,n)$ be the largest diameter of an hereditarily connected family of $d$-subsets of an arbitrary set $X$ modulo a set Y , $Y \subset X$ with $|Y|=n$.

Since we can always take $X=Y$ we have $F(d,n) \le T(d,n)$.

### 9.  A quasi-polynomial upper bound

We will now describe an argument giving a quasi-polynomial upper bound for $T(d,n)$. This is an abstract version of a geometric argument of Kleitmen and me.

Let ${\cal F}$ be a hereditarily connected family of $d$-subsets of some set $X$, let $Y \subset X$, $|Y|=n$, and let $S$ and $T$ be two sets in the family.

Claim: We can always either

1) find paths of length at most $T(d,k)$  modulo from $S$ to $d$-subsets of $Y$ whose union has more than $k$ elements.

or

2) we can find a path of this length $T(d,k)$  modulo Y   from $S$ to $T$.

Proof of the claimLet $Z$ be the set of elements from $Y$ that we can reach in $T(d,k)$ steps modulo Y   from $S$. (Let me explain it better: $Z$ is the elements of $Y$ in the union of all sets that can be reached in $T(d,k)$ steps modulo Y from $S$. Or even better: $Z$ is the intersection of $Y$ with the union of all sets in $\cal F$ which can be reached from $S$ in $T(d,k)$ steps modulo Y. )

The distance of $S$ from $T$ modulo Z  is at most $T(d,|Z|)$.

Now, if $|Z|>k$ we are in case 1).

If $|Z| \le k$ then there is a path from $S$ to $T$ modulo Z of length $T(d,k)$. If this path reaches no set containing a point in $Y \backslash Z$ we are in case 1).  (Because this path is actually a path of length $T(d,k)$ from $S$ to $T$ modulo Y).  Otherwise, we reached via a path of length $T(d,k)$ modulo Y from $S$ a set containing a point in $Y \backslash Z$, in contradiction to the definition of $Z$.  Walla.

Corollary: $T(d,n) \le 2T(d,n/2)+T(d-1,n-1)$.

By a path of length $T(d,n/2)$ modulo Y  we reach from $S$ at least $n/2$ elements in $Y$, (or $T$).  By a path of length $T(d,n/2)$ modulo Y  we reach from $T$ at least $n/2$ elements in $Y$, (or $S$). So unless we can go from $S$ to $T$ in $T(d,n/2)$ steps modulo Y  we can reach more than $n/2$ elements from both $S$ and $T$ by paths of length $T(d,n/2)$ modulo Y ,hence there is some element we can reach from both.

In other words in $T(d,n/2)$ steps modulo Y  we go from $S$ to $S'$ and from $T$ to $T'$ so that $S'$ and $T'$ share an element $u$.

But the distance from $S'$ to $T'$ modulo Y  (which is the same as the distance modulo Y  from $S' \backslash u$ to $T' \backslash u$ in ${\cal F}'[\{u\}]$ is at most $T(d-1,n-1)$.  (We use here the fact that $u \in Y$) Ahla!

To solve the recurrence, first for convenience replace $T(d-1,n-1)$ by $T(d-1,n)$. (You get a weaker inequality.) Then write $G(d,n)=T(d,2n)$ to get $G(d,n) \le G(d,n/2)+G(d-1,n)$ and $H(d,x)=G(d,2^x)$ to get $H(d,x) \le H(d-1,x) + H(d,x-1)$ which gives $H(d,x) \le {{d+x} \choose {d}}$ which in turn gives $G(d,n) \le {{log n+d} \choose {d}}$ and $T(d,n) \le n {{log n +d} \choose {d}}$. Sababa!

This is the last post in the series. The proof presented here is an abstract version of a geometric proof for graphs of polytopes by Kalai and Kleitman. Different paths to weaker quasi-polynomial upper bounds can be found here. These bounds are linear when $d$ is fixed.  A similar (even a bit simpler) argument under an even more general context was found by Razborov. (But I don’t remember it at present.) The argument above extends to the directed case. But finding an actual pivot rule for the simplex algorithm which comes close to this bound is out of reach.

I conjecture that $F(d,n)$ is polynomial (and that this holds for $T(d,n)$ and even in the greater generality considered by Razborov). I also conjectured before that it is not a polynomial, but changed my mind. So frankly, I do not have a clue. Remember that it is even possible that $F(d,n) \le n$.

Summary of earlier posts: Part 1 describes the problem. (It is repeated here.) Part 2 describe the connection to the Hirsch Conjecture.  Part 3 describes linear bound when $d$ is fixed. It also raises the question if past (or future) developements on the problem can be quasi-automatize.  Part 5 follows a question from part 4 and describes a subexponential upper bound. Part 6 describes further the connection with linear programming and with shellability, and poses a directed version of the problem.