Polymath3: Polynomial Hirsch Conjecture 4

So where are we? I guess we are trying all sorts of things, and perhaps we should try even more things. I find it very difficult to choose the more promising ideas, directions and comments as Tim Gowers and Terry Tao did so effectively in Polymath 1,4 and 5.  Maybe this part of the moderator duty can also be outsourced. If you want to point out an idea that you find promising, even if it is your own idea, please, please do.

This post has three parts. 1) Around Nicolai’s conjecture; 1) Improving the upper bounds based on the original method; 3) How to find super-polynomial constructions?

1) Around Nicolai’s conjecture

Proving Nicolai’s conjecture

Nicolai conjectured that f^*(d,n) \le d(n-1)+1 and this bound, if correct, is sharp as seen by several examples. Trying to prove this conjecture is still, I feel, the most tempting direction in our project. The conjecture is as elegant as Hirsch ‘s conjecture itself.

Some role models: I remember hard conjectures that were proved by amazingly simple arguments, like in Adam Marcus’s and Gabor Tardos’s proof of the Stanley-Wilf conjecture, or by an ingenious unexpected algebraic proof, like Reimer’s proof of the Butterfly lemma en route to the Van den Berg Kesten Conjecture. I don’t have the slightest idea how such proofs are found.

More general settings.

In some comments, participants offered even more general conjectures with the same bound which may allow some induction process to apply. (If somebody is willing to summarize these extensions, that would be useful.)

Do you think that there is some promising avenue to attack Nicolai’s conjecture?

Deciding the case d=3.

Not much has happened on the f^*(3,n) front.

What about f(d,n)?

ERSS do not give a quadratic lower bound for f(d,n) but only such a bound up to a logarithmic factor. Can the gap between sets and multisets be bridged?

And what about f(2,n); do we know the answer there?

Disproving Nicolai’s conjecture

This is a modest challenge in the negative direction. The conjecture is appealing but the evidence for it is minimal. This should be easier than disproving PHC.

2) Improving the upper bounds based on the original method.

Remember that the recurrence relation was based on reaching the same element in sets from the first f^*(n/2) families and from the last f^*(n/2) families.  The basic observation is that in the first f^*(k)+1 families, we must have multisets covering at least k+1 elements altogether.

There should be some “tradeoff”: Either we can reach many elements much more quickly, or else we can say something about the structure of our families which will help us.

What will this buy us? If we replace f(n/2) by f(n/10) the effect is small, but replacing it by f(\sqrt n) will lead to a substantial improvement (not yet PHC).

Maybe there is hope that inside the “do loop” we can cut back. We arrived at a common ‘m’ by going f(n/2) from both ends. We can even reach many ‘m’s by taking f(2n/3) steps from both ends. But then when we restrict ourselves to sets containing ‘m’, do we really start from scratch? This is the part of the proof that looks most wasteful.

Maybe looking at the shadows of the families will help. There were a few suggestions along these lines.

What do you regard as a promising avenue for improving the arguments used in current upper bound proofs?

3) How to find super-polynomial constructions?

Well, I would take sets of small size compared to n. And we want the families to be larger as we go along, and perhaps also the sets in the families to be larger. What about taking, say, at random,  in {\cal F}_1 a few small sets, and in ${\cal F}_2$ much larger sets and so on? Achieving convexity (condition (*)) is difficult.

Jeff Kahn has (privately) a general sanity test against such careless suggestions, even if you force this convexity somehow: See if the upper bound proof gives you a much better recurrence. In any case, perhaps we should carefully check such simple ideas before we try to move to more complicated ideas for constructions? Maybe we should try to base a construction on the upper bound ideas. In some sense, ERSS constructions and even Nicolai’s simple one resemble the proof a little. But it goes only “one level”. It takes a long time to reach from both ends sets containing the same element, but then multisets containing the common ‘m’  use very few elements. What about Terry’s examples of families according to the sum of indices? (By the way, does this example extend to d>3?) Can you base families on more complicated equations of a similar nature?

Anyway, it is perhaps time to talk seriously about strategies for counterexamples.

What do you think a counterexample will look like?


About these ads

73 thoughts on “Polymath3: Polynomial Hirsch Conjecture 4

  1. Pingback: Tweets that mention Polymath3: Polynomial Hirsch Conjecture 4 | Combinatorics and more -- Topsy.com

  2. Pingback: Polymath3 « Euclidean Ramsey Theory

  3. bon

    Sorry if this is an obvious question (I am a graduate student), but I was wondering why for Nicolai’s conjecture showing that f*(d,n)=d(n-1)+1 for the case when the families of monomials actually contain only one monomial each isn’t enough.
    If you have a set F_1,…F_t of families of monomials satisfying the given condition then I think you can construct a set a_1,…,a_t of monomials satisfying the given condition (with a_i in F_i):
    Take a_1 in F_1 and a_t in F_t arbitrarily.
    Then take a_2 and a_(t-1) arbitrarily from F_2 and F_(t-1) such that they satisfy the necessary gcd property given by a_1 and a_t.
    Now choose a_3 and a_(t-2) arbitrarily such that they satisfy the gcd conditions from a_2 and a_(t-1). then they automatically satisfy the gcd conditions given by a_1 and a_t.
    Thus we have a_1,..,a_t satisfying the gcd property. But this can’t be longer than d(n-1)+1 (I think this has been shown for monomials but if not I think you can show it by a certain lexicographic ordering (depending on a_1,….,a_n) on the variables x_1,….x_n.).
    I’m sure I must be missing something but I thought I’d be bold.
    In any event thanks a lot for sharing your guys work.

    1. Paco Santos

      Bon, the reason why that does not work is that you may run out of monomials to be used in the next level because some variables have been abandoned in a previous step.

      You may want to look at the following example. It does not have maximal length (Hahnle’s bound here would be 9 instead of 8) but it illustrates the point:

      [{11}, {15}, {14, 55}, {12, 35, 44}, {13, 25, 45}, {24, 34}, {23}, {22}]

      In the third level you need to choose between 14 and 55. The first choice means that you will not be allowed to use the variable 5 again, the second that you will not be able to use 1 again. You can check that there is no choice leading all the way to the final monomial 22.

      The example has been adapted from this example by Jason Conti, but there may be simpler ones.

  4. Paco Santos

    Since we believe Nicolai’s nice conjecture, we believe that $\latex f^*(d,n) = f^*(n-1,d+1)$. Do we believe this formula has any significance or is it just a coincidence?

    Certainly, the number of monomials of degree d in n variables equals the number of monomials of degree n-1 in d+1 variables. But the bijection(s) between monomials do(es) not seem to preserve convexity at all. In fact, the first problem is that in order to set up a bijection you need to order your variables, which do not come naturally ordered.

  5. Gil Kalai Post author

    Regarding Bon’s question. Perhaps we can generalize Nicolai’s conjecture from disjoint families of monomials to disjoint subspaces in the vector spaces of monomials. And then maybe for the generalized question we can hope that we can always reduce the dimension of the subspaces while keeping the convexity (yet to be defined) condition.

  6. Yury Volvovskiy

    I’ve been thinking about an approach that I’m not sure leads anywhere but for what it’s worth, here it is.

    We study a function f(d,n). Fix some number \alpha, 0<\alpha n\alpha. Then there are three sub-cases:
    (b1) U_{t/2} intersects both U_1 and U_t. Let $s$ be the largest index such that the intersection of U_s and U_1 is non-empty. Note that the intersection of U_{s+1} and U_t is non-empty too. Therefore, t\le f(d-1,n-1)+f(d-1,n-d-1).

    (b2) U_{t/2} intersects one of U_1 and U_t, say U_1. Then t/2\le f(d-1,n-d-1).

    (b3) U_{t/2} intersects neither U_1 nor U_t. Let $s$ be the largest index such that the intersection of U_s and U_{t/2} is non-empty. We can assume that \left|\cup_{j=s+1}^t U_{j}\right|\le \frac{1-\alpha}{2}\,n (otherwise we’ll look at the other side of the sequence). So
    t/2 \le f(d-1,n-d-1)+f\left(d, \frac{1-\alpha}{2}\,n\right).

    In all cases we get some upper bound on t. One obvious question I don’t have an answer for is what is a good choice of \alpha?

    1. Yury Volvovskiy

      Sorry, a part of my comment has disappeared for some reason. I meant to consider a number \alpha (between o and 1), a sequence of length t and two cases:
      a) \left|U_{t/2}\right|\le n\alpha and b) \left|U_{t/2}\right|\ge n\alpha. The three sub-cases of the latter case survived in the main comment.

      In the former case, we can deduce that t/2\le f\left(d, \frac{1+\alpha}{2}\,n\right), because one of the two halves of the sequence is supported on at most \frac{1+\alpha}{2}\,n elements.

    2. Paco Santos

      So, you get that f (d,n) is at most the maximum of the following four quantities (I use a instead of alpha to avoid latex):

      a) 2 f (d, (1+a)n/2),
      b1) f (d-1, n-1) + f (d-1, n-d-1),
      b2) 2 f (d-1, n-d-1),
      b3) f (d-1, n-d-1) + f (d, (1-a)n/2).

      My only observation is that the maximum will always be either (b1) or (a): (b2) is smaller than (b1), and twice (b3) is smaller than (a) + (b2).

      1. Paco Santos

        Upps, I forgot a factor of 2 in (b3), which gives t is at most 2f (d-1, n-d-1) + f (d, (1-a)n/2). In particular, (b2) is smaller than (b3) and the maximum lies between (a), (b1) and (b3).

    3. Yury Volvovskiy

      Let me try to show the idea in a different setting where it might be more productive. Consider now non-uniform sequence of length t on n elements, and let’s look at the supports U_{t/N} and U_{(N-1)t/N}, where N is some number, potentially depending on n. If either of the two supports has size less than or equal to n/2 we can conclude that where are at most 3n/4 elements on one side of it, and, therefore t\le N\, f(3n/4). On the other hand, if both supports have size greater than n/2 then they have an element in common and thus t\le \frac{N}{N-2} f(n-1).

      Now, if we choose N to be a constant than the first estimate is polynomial and the second one is exponential which is not interesting. But what if one sets N=n? Then the second estimate is linear and the first one gives something like n^{\log n}, which is much more interesting but still nothing new. There must be a yet better choice for N.

      1. Yury Volvovskiy

        Well, I guess not. Looks like the best choice for N is N-2 = \frac{f(n-1)}{f(3n/4)} in which case we have f(n)\le f(n-1)+2f(3n/4). That’s a worse estimate than what we already had.

      2. Yury Volvovskiy

        It’s quite easy to upgrade the inequality to f(n)\le f(n-1)+2f(n/2). All one has to do is to look at the union of supports from U_1 to U_{t/N} as of course was done in the original argument. So I’m not sure any extra mileage can be extracted form this whole thing.

      3. Paco Santos

        My general impression is that we know how to take advantage of a sequence being “thin” (case (a) in Yury’s post this morning) but we don’t know how to take advantage of it being “thick”.

        One would expect that if, say, half of the elements are used all the way from t/4 to 3t/4 this fact should imply something stronger then t/2 \le f(n-1), which is implied already by a single element being used all the way from t/4 to 3t/4…

  7. Olivier Bousquet

    I have looked at the proof of the upper bound for f^*(d,n) (ie Lemma 1 and Corollary 1 of Polymath3) and I had the following idea which does not quite work but may inspire others:

    As usual, we start with a sequence of d-uniform multiset families F_1,\ldots,F_t.
    Let’s partition the sequence F_1,\ldots,F_t into intervals in the following way:
    We first pick a random element say x_1 of the support of F_1 and denote by I_1 the interval [1,i_1] where i_1 is the last index of an F_i containing x_1.
    Then we pick a new element x_2 in the support of F_{i_1+1} and denote by I_2 the interval [i_1+1,i_2] where i_2 is the last index of an F_i containing x_2. And so on until the end.
    By definition, after i_k, x_k cannot appear anymore since when one element is removed from the support it cannot be added again.
    This means that our sequence of intervals I_k can contain at most n elements.
    Also, if we restrict the F_i for i \in I_k by excluding x_k, we get a convex sequence of d-1-uniform multiset families.
    So we have as in the original proof f^*(d,n)\le \sum_{k} f^*(d-1,|S_k|).

    Now assume that we would be able to prove that \sum_{k} |S_k| \le 2n-1, then together with the fact that we have at most n intervals, and using the induction hypothesis of f^*(d-1,n)\le (d-1)(n-1)+1, then we would obtain
    f^*(d,n)\le (d-1)(\sum_k|S_k| - n) + n \le (d-1)(2n-1-n)+n = d(n-1)+1.

    So this would give exactly the right bound, but there are two things that do not exactly work:

    The first one is that this is valid only if we have exactly n intervals.
    The second one is that we need to prove that \sum_k |S_k|\le 2n-1 which is not as easy as in the original proof since we don’t have disjointness for non-consecutive intervals.

    But the interesting point that this illustrates is that if one can decompose the sequence into exactly n intervals with the property that each interval contains at least one common element and that \sum_k |S_k|\le 2n-1, then we can prove the bound f^*(d,n)\le d(n-1)+1. How realistic is such a construction?

    1. Olivier Bousquet

      Another way to formulate the above is the following: if we can always decompose the sequence [1,t] into intervals such that \sum_k (|S_k|-1)\le n-1 and k\le n then we can prove Nikolai’s conjecture (I am using here the notation of Corollary 1 of Polymath3). Note that since the intervals satisfy the condition that the F_i in intervals I_k don’t contain the common element in I_\ell for any \ell < k, then k has to be smaller than n. So only the sum condition needs to be verified.

      It seems that this is the case (on some simple examples I looked at), provided one can construct the sequence without being forced to start at one end. So in other words, we need to find a decomposition which minimizes \sum_k (|S_k|-1), and on some examples this seems to be possible and yield the desired properties.

  8. Paco Santos

    I can now show that f(5) is at most 12. Since we have Conti’s example of length 11, f(5) must be 11 or 12. In fact, as a byproduct of my proof I have also found a second example of length 11: [{}, {1}, {12}, {125}, {15, 25}, {135, 245}, {145, 235}, {35, 45}, {345}, {34}, {4}].

    Suppose we have a sequence of length 13 on 5 elements. Wlog the first or last level consists only of the empty set, so we have a sequence of length 12 with no empty sets. Then:

    – F_1 \cup F_2 \cup F_3 already use at least three elements: if not, they form the unique sequence of length three with two elements and no empty set, namely [{1}, {12}, {2}]. But in this case the element {1} has already been abandoned in F_3, so it will not be used again. This means that F_3 … F_12 forms a convex sequence of length 10 in four elements, a contradiction.

    – With the same argument, F_10 \cup F_11 \cup F_13 use at least three elements. In particular, F_3 and F_10 have a common element, say 5, so restricting F_3,…,F_10 to the sets using 5 we have a sequence of length 8 on the other four elements. So far so good, since f(4)=8.

    – But this would imply that in the restriction we can assume wlog that F_3={\emptyset}. Put differently, F_3 contains the singleton {5}. Since F_3 is the first level using 5, this singleton could be deleted from F_3 without breaking convexity. This gets us back to the case where F_1\cupF_2\cup F_3 use only two elements, which we had discarded.

    1. Paco Santos

      I forgot to mention how I got the “byproduct”. My proof gives quite some information on what a sequence of length 12 with five elements should look like. Leaving aside the level with the empty set, which we assume to be F_12:

      – F_1 and F_2 use only two elements and F_2 uses both of them. That is, wlog our sequence starts either [{1}, {12}, …] or [{12}, {1}{2}, …]. In fact, if the second happens we can swap F_1 and F_2 and then remove from F_1 one of the two singletons. So, wlog [F_1,F_2] = [{1},{12}]
      – Same argument for F_10 and F_11. Wlog [F_10,F_11]=[{34},{4}]
      – All of F_3,…,F_9 use 5 and none of them contains the singleton {5}. That is, restricting them to 5 we have a sequence of length seven with no empty set.

      One way of constructing F_3…F_9 (and maybe the only one, although I don’t have a proof) would be a sequence of length seven on four elements that starts with {12} and ends with {34}. I did not find that, but I found (more precisely, we know since some weeks ago) one of length six: [{12}, {1, 2}, {13, 24}, {14, 23}, {3, 4}, {34}]. Joining it to 5 and then adding the head and tail of length two plus the empty set gives the sequence of length 11:

      [{1}, {12}, {125}, {15, 25}, {135, 245}, {145, 235}, {35, 45}, {345}, {34}, {4}, {}]

      1. Paco Santos

        Hum, when I said “maybe the only one” I was too quick. Jason Conti’s sequence was constructed differently. Relabeling it to better match my notation his sequence is
        [{1}, {12}, {125}, {15, 25}, {135, 2}, {13, 35, 24}, {5, 23, 14},{3, 45}, {34}, {4}, {}]

        The head and tail are indeed as in my proof [{1}, {12}, …, {34}, {4}, {}] (in fact, the arguments in the proof imply every sequence of length more than ten can be easily modified to have precisely that head and tail).

        But the restriction to 5 is different and finishes with the singleton 4: [{12}, {1,2}, {13}, {3}, {}, {4}]. Joining this to 5 and simply adding the head and tail does not give a convex sequence, because the subsequence has abandoned the element 3 and we use it again in the tail, but convexity is restored by extra sets in the central part not using 5.

  9. Paco Santos

    What would be really nice is to use the arguments in the proof of f(5) \le 11 to strengthen the recursion f(n) \le 2f(n/2) + f(n-1).

    Part of the argument is that this formula overcounts the empty set not once (which gives the -1 in the wiki) but twice. That is not important asymptotically. The other part vaguely says:

    “For f(n) to be close to 2f(n/2) + f(n-1) we need our sequence to consist of:
    – A head and a tail of lengths close to f(n/2), with disjoint supports.
    – A central part of length close to f(n-1) and using an element all the way.
    But if the three subsequences have length close to maximal then their ends will be too thin for us to be able to glue them together preserving convexity”.

    The challenge is to make this vague argument more precise…

    1. Gil Kalai Post author

      Here is some idea in this direction. Lets consider f(d,n) (or f^*(d,n)).
      Think about n as much larger than d. Our argument is based on reaching the same ‘m’ after maling f(d,n/2) moves from both ends. The point is that the union of all sets in the first s+1 families where s=f(d,u) is larger than u.

      Suppose that we want to replace f(d,n/2) by SUM f(d-i, n/1000)
      which we can replace by d f(d,n/1000). If we can do it this will decrease the
      constant in the exponent n^logd. It sounds appealing. we reach n/1000 m’s then when we fix m and consider only sets containing it we reach n/1000 new m’s and we continue in all possible ways. It seems that in df(d,n/1000) we will reach many more elements of our original ground set unless there is some structure to our family that we can expolit in another way.

  10. Yury Volvovskiy

    I tried to follow Gil’s advice and think about the properties that ensure f(n)\le f(n-1) + 2f(n/2) inequality, or the f(n)\le n^{\frac{1}{2}\log_2 n } upper bound. Looking at the two proofs (the original one [see Wiki], and mine, which is more complicated but gives the upper bound a bit more explicitly) one can see that we only look at the supports of the families and therefore do not explicitly use the fact that families are disjoint (the supports aren’t). So here’s the idea:

    Let’s look at legal sequences of subsets of [n] defined inductively as follows:
    1. The only legal sequence on 0 elements is \{\emptyset\}.
    2. Any legal sequence on n-1 elements is also a legal sequence on n elements.
    3. A sequence \{S_1,\,S_2,\dots,\,S_k\} on n elements is legal if and only if
    3a) every proper subsequence is legal (there are two possible versions of this rule: the less restrictive one only requires that intervals are legal, the more restrictive – that all subsequences are. The difference can be demonstrated by the sequence \{\emptyset,\{1\},\emptyset\} which is legal in the former sense but not the latter)
    3b) if an element a belongs to every S_i then there are subsets S_i^{*}\subset S_i\setminus\{a\} such that \{S_1^*,\,S_2^*,\dots,\,S_k^*\} is a legal sequence on n-1 elements.

    The n^{\frac{1}{2}\log_2 n } upper bound for the length of a legal sequence is proved by the exact same argument. The question is if it’s possible to construct a legal sequence of super-polynomial length.

    1. Yury Volvovskiy

      Here’s a quadratic example in the case where only intervals are required to be legal:
      \emptyset, 1, \emptyset
      \emptyset, 1,  12, 12,  2,\emptyset
      \emptyset, 1, 12, 123, 123, 123, 123, 23, 3,\emptyset
      \emptyset, 1, 12, 123, 1234, 1234, 1234, 1234, 1234, 1234, 1234, 234, 34, 4,\emptyset
      The length of such a sequence is (n+1)(n+2)/2.

      1. Paco Santos

        To answer Gil’s question: “Do you have an example (of sets) showing that f(n) is at least (n+1)(n+2)/2?”

        The answer is “No”. What Yury has is a more general model, hence giving rise to a new function y(n) which is guaranteed to be at least as big as f(n). What Yury’s example shows is that y(n) \ge (n+1)(n+2)/2, which does not directly say anything about f(n).

        But the hope is that it might be easy(-er) to analyze (and maybe find polynomial upper bounds for) y(n) than f(n), since Yury’s model is based in forgetting -at least partially- the interaction between the different elements used in each level.

        I have to say I was very excited when I read his posts. At first it looked obvious to me that y(n) \le y(n-1) +n. This lasted until I tried to actually prove this inequality…

      2. Gil Kalai Post author

        Dear Yury and Paco, I am not sure I understand what precisely y(n) is. And also why f(n) is smallet or equal than y(n). Can you explain again?

      3. Yury Volvovskiy

        Dear Gil,

        Let me try to give a formal definition. Let R_n be a collection of sequences of sets \{S_1,S_2,\dots,S_k\}, \left|\cup S_i\right|\le n, satisfying following conditions:
        1. R_0 consists of the only sequence \{\emptyset\}.
        2. Convexity: S_i\cap S_j\,\subset S_k for i<k<j
        3. R_{n-1}\subset R_n.
        3'. If a sequence \{S_1,S_2,\dots,S_k\} is in R_n, but \left|\cup S_i\right|< n then it is also in R_{n-1}.
        4. Any subsequence of a sequence in R_n is also in R_n.
        5. Induction: if there's an element a common to all the sets in a sequence \{S_1,S_2,\dots,S_k\} in \{S_1,S_2,\dots,S_k\}, then there must exist sets S_i^*\subset S_i\setminus\{a\} such that \{S_1^*,S_2^*,\dots,S_k^*\} is in R_{n-1}.

        Then y(n) is defined as the maximal length of a sequence in R_n. It is at least as large as f(n) because given a convex sequence of families of subsets of [n] the sequence of the supports of these families is in R_n.

        By definition y(0)=1. The upper bound y(n)\le n^{\frac{1}{2} log_2 n } is obtained by the argument from the Wiki : one uses properties 3 and 4 to separate opening and closing subsequences on n/2 elements, then uses convexity (property 2) to show that the subsequence in the middle has a common element and then uses induction (property 5) to show that its length is bounded by y(n-1) recovering the estimate y(n)\le y(n-1)+2y(n/2).

    2. Paco Santos

      1) I have been thinking about Yury’s model and I think the following is true: “there is no loss of generality in assuming that the intervals where the different elements are active are never properly nested to one another”.

      The proof is as follows (please Yury, check if I understood things correctly): suppose we have two elements i and j such that i appears strictly earlier than j and disappears strictly later than j (this is what I mean by “properly nested”). Let t_j and t_i denote the last levels where i and j are used. Then it seems to me that changing i to j in all the levels from t_{j+1} to t_i still produces a valid sequence, since the restriction of it to either i or j is a subsequence (actually, an interval) of the original one restricted to i, and for the restrictions to the rest of elements we can use induction.

      2) If (2) is true it means the following: there is a natural order of the elements such that they all appear and disappear according (weakly) to this order. I say weakly because two elements may appear and/or disappear at the same time.

      3) Now, (2) implies a further simplification of the model. We do not need to remember the actual elements used in each level, but only the number of them. The sequences are no longer sequences of subsets of [n] but sequences of numbers. The actual subsets can be derived from the numbers.

      For example, the sequences in Yury’s last example become:

      What is not clear is what Yury’s axioms become in this simplified model. (In particular, it is not completely clear that my model is truly a simplification; the sequences are simpler but the definition of validity is more intricate).

    3. Paco Santos

      My part (3) is not completely true. In the same step some elements may appear and some others disappear from the support, and that is not detected by remembering only the cardinality of the support. Tu give a concrete example, the sequence [0,1,2,3,3,2,1] can represent Yury’s sequence on 3 elements [0, 1, 12, 123, 123, 23, 3] but it may also represent the following sequence on four elements: [0, 1, 12, 123, 234, 34, 4].

      But I still believe (1) and (2) are true in Yury’s model. Put differently, I think in Yury’s model there is no loss of generality (or rather, there is no loss of length) in assuming that all the S_i’s are intervals.

      1. Yury Volvovskiy

        This is easy to deal with. If an element i disappears on the same step when an element j appears in the set, we can simply add i to this set. When doing restriction by i we trim the last set down to \{j\} and vice versa.

        Another observation is that if j only appears together with i one can add j to all the sets that contain i. That’s a slightly easier way to show that the intervals are not nested – if they are, they can actually be assumed to coincide. So if two elements appear in the sequence at the same time, they also disappear at the same time.

      2. Yury Volvovskiy

        On a second thought, Paco’s way of getting rid of properly nested elements is better than mine because it survives restriction. The property of not having one elements appear and another one disappear in the same step also survives restriction.

        That gives some hope for Paco’s proposed simplification. Validity remains convoluted though: one has to say that the sequence has height n if the sum of all up-jumps is n (that corresponds to the number of elements in the original formulation). Then we need to require that any subsequence between an up-jump and the corresponding down-jump (appearance and disappearance of an element) strictly dominate a legal sequence of height at most n-1.

        Example: height 3 sequence [0,1,2,3,3,2,1] is good because subsequences [1,2,3,3] and [2,3,3,2] strictly dominate a good 2-sequence [0,1,2,1] and subsequence [3,3,2,1] strictly dominates another good 2-sequence [1,2,1,0].

      3. Gil Kalai Post author

        The (emerging) understanding that there is (without losing generality) some natural ordering on the elements, and perhaps also (that we may assume without losing generality) that this (weak) ordering is respected by restrictions look to me as giving hope for better upperbounds. (But I cannot explain why I think so.)

        (By restrictions I mean: moving to legal sequence after we deleted a common element in a certain interval. I am not sure this is what Yuri means and I am not sure if we indeed can assume that the ordering respect restrictions.)

      4. Paco Santos

        Gil said: “I am not sure this is what Yuri means and I am not sure if we indeed can assume that the ordering respect restrictions.” I think the answer to both parts is “yes”. Let me prove the second (only Yury can prove or disprove the first).

        Let us call a valid sequence of sets (valid in the sense of Yury) *monotone* if it satisfies my additional axiom that the elements appear in order and disappear in order. Put differently, if every $S_i$ is an interval of the form $[l_i, r_i]$ and both the sequences of $l_i$’s and $r_i$’s are (weakly) monotone. Of course, implicit in the definition of monotone is a prescribed ordering of the elements.

        “Respected by restrictions” means:

        Lemma: If a sequence is valid and monotone, then the valid restrictions of it with respect to every element can be chosen monotone.

        Proof. If the original sequence $S$ is monotone, the operation of “taking only the sets containing a certain k and removing k from them” gives a (perhaps not valid) monotone sequence, which I will denote $S/k$. The original sequence being valid means this monotone sequence $S/k$ contains a valid sequence $T$ (“contains” in the sense of Yuri’s axiom 5).

        Suppose $T$ is not monotone. Then, there are indices $i<j$ such that either $j$ appears or disappears before $i$. Do the following:

        – if both things happen, exchange $i$ and $j$ all throughout $T$.

        – if $j$ appears before $i$, add $i$ to all the sets from the appearance of $j$ to the appearance of $i$ (or change all $j$'s to $i$'s in those sets, both operations work).

        – if $j$ disappears before $i$, add $j$ to all the sets from the disappearance of $i$ to the appearance of $j$ (or change all $i$'s to $j$'s in those sets, both operations work).

        All these operations preserve validity and give sequences contained in the "monotone envelope" of $T$, which means they are still contained in $S/k$. Doing them on and on will eventually lead to a monotone sequence contained in $S/k$. QED

        Incidentally, in this post I stated that “monotone” was equivalent to “every $S_i$ is an interval”. That is not enough, as the sequence [2,12,23] shows. We need the extra condition that the sequences of extrema of the intervals are monotone.

      5. Gil Kalai

        This is vey interesting, Paco. As I said, having a natural ordering on the elements which is preserved under restrictions looks useful to me for improving the upper bounds. But I cannot really justify this feeling.

  11. Jason Conti

    I found a way to transform another example of f(5) that is easy to generalize to any n (of length 2n) into the example for f(5) of length 11. I think we may be able to use a variation to generate longer sequences with larger n.

    Start with the family with n symbols generated by:
    F_1 = {}
    F_i for 2 >= i >= 2n
    * contains {j, k} for j + k = i, 1 <= j, k = i

    For n = 5:
    [{}, {1}, {12}, {2, 13}, {23, 14}, {3, 15, 24}, {34, 25}, {35, 4}, {45}, {5}]

    [{}, {1}, {12}, {2, 13}, {23, 14}, {3, 15, 24}, {34, 25}, {345}, {45}, {5}]

    Add({35}, 7)
    Remove({25}, 7)
    [{}, {1}, {12}, {2, 13}, {23, 14}, {3, 15, 24}, {34, 35}, {345}, {45}, {5}]

    Add({25}, 5)
    Swap(5, 6)
    [{}, {1}, {12}, {2, 13}, {3, 15, 24}, {23, 14, 25}, {34, 35}, {345}, {45}, {5}]

    Insert({235, 4}, 7)
    [{}, {1}, {12}, {13, 2}, {15, 24, 3}, {14, 23, 25}, {235, 4}, {34, 35}, {345}, {45}, {5}]

    The reason I think this may help, is that as n gets larger, more opportunities open up to add additional families. For n = 7, the procedure above can be performed on both ends of the sequence, yielding a sequence of length 16:
    [{}, {1}, {12}, {123}, {13, 23}, {134, 2}, {14, 25, 34}, {15, 24, 3}, {17, 26, 35, 4}, {37, 46, 5}, {36, 45, 47}, {457, 6}, {56, 57}, {567}, {67}, {7}]
    (I still think an example of f(6) of length 14 exists, but the above procedure didn’t quite work) My hope is that this can be improved on as n gets larger, by including longer sets in the families.

    Another idea, that is somewhat related, is to take a sequence with length > 2n, duplicate, mirror, replace the symbols with a disjoint set and remove the empty set family, and then joining them together with a sequence of families containing symbols from both. This is related if one considers a general form of sequences of length 2n + 1:
    [{}, {1}, {12}, {13, 2}, {15, 24, 3}, {14, 23, 25}, {235, 4}, {34, 35}, {345}, {45}, {5}, {56}, … , {(n-1)n}, {n}]
    Duplicate, mirror and join it with the family {n(n+1)}, and the interior families can be converted to a set of families that is similar to the initial sequence at the start of this post (using the leftover subsets), that may be possible to extend with the above operations (working on an example for f(14), but got sidetracked).

    Using the example for f(5) of length 11, and just the basic family {n(n+1)}, the above will yield a sequence length of 11n/5, for n=5,10,20,40, …, which really isn’t great. Perhaps modifying the end of the first sequence, the start of the second sequence, and a clever choice of joining families that increase with n each iteration could improve it.

  12. Gil Kalai Post author

    This comment is in reply to Yury’s explanation of the new even more abstract version which abstrat the properties of supports of our families {\cal F}_1,\dots,{\cal F}_t.

    Dear Yuri, I see, this is very interesting!! I will also think about the version of Paco. This looks like it should be an easier question. (But we often had this feeling before.)

  13. Yury Volvovskiy

    I discovered that y(n) can be larger than (n+1)(n+2)/2. For n=5 there’s a sequence of length 22. Here it is
    The reductions are:
    1: \emptyset,2,\emptyset,3,34,34,345,345,45,45,5,\emptyset
    2: \emptyset,1,13,134,134,134,1345,1345,1345,345,345,345,45,5,\emptyset
    3: \emptyset,1,12,124,124,124,1245,1245,1245,245,245,245,45,5,\emptyset
    4: \emptyset,1,12,12,123,123,1235,1235,235,235,235,235,35,5,\emptyset
    2: \emptyset,1,12,12,123,123,23,23,234,234,34,34,4,\emptyset.
    That’s sort of interesting since I was for some reason pretty sure that the quadratic upper bound should work. It still might but I’m not so sure anymore.

    Another question (I didn’t have time to think about it) is whether it’s possible to give an upper bound for y(n) in terms of f(p(n)), where p(n) is a polynomial.

      1. Yury Volvovskiy

        Yes, that’s the version I’m using – somehow it seems more convenient, although I don’t think there’s an essential difference between the two. The (n+1)(n+2) upper bound appeared in the context of that version too.

  14. Paco Santos

    I am not sure this helps, but in Yury’s model there is an analogue of d-uniform sequences: Define collections R_{d,n} rather than R_n, and modify the induction axiom 5 to read something like “The restriction of a sequence in R_{d, n} to any element $k$ is a sequence in R_{d-1,n-1}“. Put also the additional axiom “6. Every $S_i$ has at least d elements”. (This extra axiom may not be strictly necessary to get something sensible, but it seems natural).

    If we denote y(d,n) the maximal length of sequences in R_{d,n} so obtained it is pretty easy to show that:

    y(1,n)=n (if d=1, no element can appear in two different S_i’s; on the other hand, the sequence [1, 2, 3, …, n] is valid).

    2n-4 \ge y(2,n) \ge  2n-3. The upper bound is as in the f(d,n) case and the lower bound follows from the sequence: [12, 123, … , 123… n-2, 123 … n-1 n, 123 … n-1 n, 3…n-1 n, … , n-2 n-1 n, n-1 n]. I think that y(2,n) = 2n-4.

    sequence so obtained it is pretty easy to show that y(1,n)=n

    1. Gil Kalai Post author

      This looks helpful. I suppose we would like to think about strategies for proving better upper bounds for y(n) and y(d,n), perhaps even proving that y(d,n) is at most d(n-1)+1 (say) or just a similar bound for y(n) (or for similar functions extended to monomials/multisets; which we did not look at) .

  15. Gil Kalai Post author

    It certainly looks to me that the best way to bound y(n); y(d,n) would be some clever direct combinatorial argument. (But I dont know what it will be.)

    Still going back to the linear-algebraic suggestion and Nicolai’s basic example, we may think of something like that: Yury and Paco defined (reccursively) what is a d-legal sequence of an n element sets; we have an example of a d-legal sequence of subsets of {1,2,…,n} and we want to bound its length t.

    (A reminder: the translation from families to such d-legal sequences is done by looking at the supports of the ith families i=1,2,…,t. Yury formulated a set of axions for these supports which still leads to the upper bound that we knew, and this setting look much more abstract and general, and Yury also showed some examples.)

    We also may assume (I think this is what Paco demonstrated) and use that the ordinary ordering 1,2,…, n is compatible with the ordering of this legal sequence and all restrictions.

    So we start with n variables x_1,…,x_n and consider n linear combinations of the x_i’s
    called y_1,…,y_n

    We consider the d(n-1)+1 special monomials in the x_i s that came from Nicolai’s example; just all monomials of degree d involving 2 consequtive variables. Lets call them N-monomials.

    The claim we would like to have is this: we can chose degree d monomials , the ith supported on the variables corresponding to S_i such that expressed in terms of the N-monomials (and neglecting all other terms) we get linearly independent polynomials.

    Somehow, I feel that taking the matrix transforming the x_i sto the y_j s to be triangular (w.r.t. the natural ordering of the x_i s that we talked about) will give an inductive argument for linear independence a better chance.

  16. Gil Kalai Post author

    We had a few days of silence. Personally, I am very encouraged by the new level of abstraction that Yury considered and the subsequent observations by Yury and Paco and I hope some people are thinking about it. I plan not to wait for 100 comments on this post but rather to write this weekend a new post briefly describing these developments.

    1. Paco Santos

      As Gil, I feel quite optimistic that Yury’s simplified model might lead to something new. One idea that comes to my mind is to try to understand the “maximal” sequences [S_1,…,S_t] valid in Yury’s context. By maximal I do not mean that $t$ is maximal (within the sequences with a given number $n$ of symbols) but rather that no element can be added to any $S_i$. Observe that inserting an element $k$ to an $S_i$ “helps” the restriction with respect to every element other than $k$ to be valid, so the only problem for insertion is the restriction to $k$ itself.

      For example, some of the things Yury and I have said so far imply:

      1) In a maximal valid sequence, if the symbols {1, 2,…,n} are permuted so that they appear in weak monotone order (that is, the first appearance of each $k$ happens before or at the same time as that of $k+1$) then the symbols also disappear in order. That is, every maximal sequence is “monotone” in the sense of this post (modulo permutation of the symbols). Even more so, if two elements appear at the same time in a maximal sequence then they also disappear at the same time, and vice versa.

      2) In a maximal valid sequence each $S_i$ either contains or is contained in the next $S_{i+1}$. This follows from this argument of Yury.

      As a corollary from (1) and (2), a maximal valid sequence can be completely recovered from the sequence of cardinalities of the $S_i$’s.

      One problem with this approach is that it is not clear whether maximality is preserved by restriction. That is: can the restrictions of a maximal sequence with respect to all the elements be taken maximal? Probably not…

  17. Yury Volvovskiy

    I find it convenient to work in the world according to Paco’s simplification where you replace a set by a single number, the cardinality of the set. That allows a geometric interpretation where instead of a sequence you consider the graph of the cardinality function N(k), which is the cardinality of the kth set of the sequence. The restriction (induction) condition means that you can nest smaller graphs inside the big one.

    It looks like it could be interesting to study symmetric graphs. For instance, my first example for n=4
    in Paco’s notation would look like
    All 4 restrictions could be made the same, given by the sequence 0,1,2,3,3,3,3,2,1,0.

    Here’s a bit more convoluted example: consider a legal sequence
    s= \left\{0,1,2,2,3,4,5,5,5,5,5,5,5,4,4,4,4,3,2,1,0\right\}
    of height 5 and length 21 and another sequence
    S = \left\{0,1,2,3,3,4,5,6,6,\dots,6,6,5,4,3,3,2,1,0\right\}
    of height 6 and length 29. The interesting thing is that all the restrictions of S can be chosen to be either s or its mirror image.

    If I’m right (I haven’t fully convinced myself yet) a similar construction exists for larger heights and it improves the lower bound from n^2/2 to 5n^2/8.

    1. Yury Volvovskiy

      To give a bit more color: S is the smallest sequence that covers s. It starts with 0,1, then it grows every time s grows and stays level when s stays level or drops. The first time S drops is when s ends (that corresponds to the first element leaving the support). Then we make our descent from 5 back to 0 symmetric to ascent from 0 to 5 in the beginning of the sequence.

  18. Paco Santos

    On the path of simplifying the model again and again, but keeping the proof of 2^{O(\log^2 n)} valid, I would propose Yury’s axiom 5 (recursion) to read simply:

    5′. Induction: if there’s an element common to all the sets in a sequence, then the length of the sequence does not exceed the maximum length of a sequence on n-1 elements.

    Put differently, the interval on which a certain $k$ is active cannot be longer than $y(n-1)$.

    1. Paco Santos

      That might be the case, but it would be interesting. In a sense, what we are doing with these iterated generalizations/simplifications is exploring the “limit of abstraction”. What I like about the new model is it seems more suited to computer experimentation; to compute s(n+1) (if I may denote this way the function obtained with this model) you do not need to remember all the valid sequences you got for s(n), just the actual value of s(n).

    2. Yury Volvovskiy

      It seems to allow a rather trivial superpolynomial construction. The sequence of length s(n-1) whose every element is [n] is legal. So is the sequence of length s(2n) whose first s(n-1) elements are [n] and the rest are [2n+1]. Finally by adding to this sequence s(n-1) elements of the form [2n]\setminus [n] we get a legal sequence on 2n+1 elements of length s(2n)+s(n-1). Am I missing something?

      1. Paco Santos

        I think you are right.

        At least this clarifies a point. In my attempts to prove that your y(n) is polynomial I think I have always been using your axiom 5 in my simplified form. Now I know that this could not work…

  19. Nicolai Hähnle

    I’ve been trying to wrap my head around Yury’s proposed y(n)-abstraction. I think I understand it, but let me just write out some things that have been bothering me. In particular, given a sequence of subsets of [n], how does one check whether that sequence is valid? Let me try to write it out in some form of pseudo-code.

    Input: sequence S_1, …, S_t, U = \bigcup S_j
    Output: Yes or No

    1. If U is empty (corresponding to n=0), output Yes if t \leq 1, else output No.
    2. If the sequence is not convex, output No.
    3. For all proper sub-sequences, perform the check recursively. If any of the recursive checks return No, return No.
    4. If there is an element x common to all sets S_j, then create T_j = S_j \setminus \{ x\}. For all sequences that can be obtained by taking (not necessarily proper) subsets of T_1,\ldots,T_t, perform the check recursively. If all of these recursive checks return No, return No.
    5. If we reach this point, all tests have passed and we return Yes.

    It seems that instead of checking all possible combinations of subsets in step 4, we can “annotate” sequences of sets by the reduction induced by each element. Then we can simply check in step 4 whether the “annotated reduced sequence” is valid. This simplifies the checks a lot.

    Is it correct that those reduced sequences can be chosen to be “commutative”? What I mean by this is that it should not matter whether we do “induction” first on x, then on y, or first on y, then on x.

    I was thinking whether the linear algebra approach could work for this new abstraction. It seems much more reasonable to hope that we can associate some object to each element of the sequence, and the quadratic bound makes me think of associating matrices to the sets in the sequence.

    I’ve tried some approaches of associating matrices with sets in the reduced sequences, and then combining those matrices somehow to get to matrices associated with the sets in the entire sequence. I would like to then find some statement of the form: if I have a non-trivial combination of the matrices yielding 0, then the same should be true for one of the reduced sequences. I was not successful so far, perhaps because I didn’t find a good way to use convexity.

    1. Paco Santos

      A couple quick comments:

      – In point 3, I think you do not need all the sub-sequences. It seems to me it is enough to check the maximal sub-sequences containing each x, plus checking that you don’t have too many occurrences of the empty set. (Too many means: only one empty set if you take the full “sub-sequence” axiom, no two consecutive empty sets if you only take the “sub-interval” axiom).

      – Concerning “Is it correct that those reduced sequences can be chosen to be “commutative”? What I mean by this is that it should not matter whether we do “induction” first on x, then on y, or first on y, then on x.” I think the answer is: If you assume commutativity you recover the original model that gives f(n). My argument: if you assume commutativity, for each subset S\subseteq[n] you can define the “active interval” of $S$ to be the interval where the sequence annotated to $S$ happens. Define then $R_i$ to be the family of maximal subsets $S$ that are active at time $i$. The $R_i$’s so obtained form a convex sequence of families of sets, except a priori some set $S$ may appear repeated in two different $R_i$’s. Now, if that happens, then the sequence annotated to that set $S$ contains the empty set two (or more) times, which is impossible (with the strong form of the sub-sequence interval).

      1. Nicolai Hähnle

        Dear Paco,

        yes, I agree that far too many checks are applied in what I originally wrote.

        The observation about the “commutative” is very interesting! Your argument looks good to me. Seems like commutativity bites itself with your monotonicity property though. I think one can make a valid commutative sequence monotone while keeping the commutativity, but I believe there will be conflicts if one tries to get monotonicity in the recursions. Or can these be fixed?

        I was considering the following line of attack – which I am much less optimistic about now because of this conflict, but I’m going to write it down anyway.

        Build a directed acyclic graph on vertices s, n, n-1, …, 2, 1, t with all possible arcs going from left to right. Assign to every set an s-t-path in this DAG in the following way. The first arc is from s to the largest element of the set. The next arc is from there to the largest element in the “recursed” set, etc., until the empty set is reached, which is indicated by an arc to t.

        The nice property here is that the dimension of the space of s-t-flows is exactly the number of sets in Yury’s examples, and I was looking for ways to prove linear independence. It is clear that the same path cannot occur twice, because then we would have a way to recurse down to a sequence containing two empty sets (I am assuming the stronger formulation of all subsequences, not just subintervals). To argue that other sequences of paths are impossible I wanted to use commutativity.

        The question is whether commutativity or monotonicity is the more helpful property for proofs. Strictly speaking, commutativity must be stronger if it allows to recover f(n) (which we already know to be strictly smaller than y(n), if I remember the case n=3 correctly).

      2. Paco Santos

        You remember correctly about the $n=3$ case. We know f(3)=6, achieved by the sequence [0,1,12,123,23,3], and we know y(3)=7, achieved by the sequence [0,1,12,123,123,23,3].

        This is actually a nice example of why commutativity cannot be assumed in the y model. Starting with this length seven sequence, the only valid restriction at 1 is [0,2,23,3] and the only valid restriction at 3 is [1,12,2,0]. Restricting the first at 3 gives [2,0] while restricting the second at 1 gives [0,2].

      3. Nicolai Hähnle

        After a good sleep, I realized that my DAG idea doesn’t work in the y(n) model, even with monotone sequences. Consider the sequence [12,12,123,3], with restriction [0,2,23] at 1, [0,1,13,3] at 2, and [12,2] at 3, I would assign paths s-2-t, s-2-1-t, s-3-2-1-t, s-3-2-t, and those paths are linearly dependent.

  20. Paco Santos

    Hello everyone,

    I am afraid I can show that y(4n) \ge n y(n), which implies a super-polynomial lower bound. The exact inequalities I prove, which eventually give the one above, are:

    y(2n+2) \ge 2 y(n),
    y(2n+4) \ge 3 y(n),
    y(2n+6) \ge 4 y(n),
    y(2n+8) \ge 5 y(n),
    y(2n+10) \ge 6 y(n), …

    … and so on.

    For the first one, we simply observe that the sequence with y(n) copies of [n+1] is valid on n+1 elements, and use two blocks of it to show y(2n+2) \ge 2y(n). Since this “blocks” idea is crucial to the whole proof, let me formalize it a bit. I consider my set of 2n+2 symbols as consisting of two parts A and B of size n+1, and my sequence is [A, A, ..., A, B, B, ...., B], with a first block of A‘s of length y(n) and a second block of B‘s of the same length.

    Now, I increase my set of symbols by two, putting one in A and one in B. Then I can construct a valid sequence with *three* blocks of length y(n) each: a first block of A‘s, a second block of A\cup B‘s and a third block of B‘s.

    But if I put one more symbol to A and to B, so that I now have 2n+6 in total, I can build a valid sequence with *four* blocks of length y(n): a first block of A‘s, a second and third blocks of A\cup B‘s and a fourth block of B‘s.

    And so on…

  21. Nicolai Hähnle

    This is quite remarkable, Paco! If I am not mistaken, this gives at least y(4^k) \geq 4^{k(k-1)/2} or y(n) \geq 2^{\log(n)^2/8-\log(n)/4} if one starts at n=1. That is quite close to the upper bound.

    1. Paco Santos

      Yes; this settles the complexity of y(n) to be 2^{Theta(\log(n)^2)}.

      So I guess this means we go back to $\latex f$ and $\latex f*$, and the question is what did we learn from $\latex y$.

      One thing we learnt is that we can model $f(n)$ by Yury’s axioms together with commutativity of the restrictions. Another thing is that keeping track only of the intervals when individual elements are active will not be enough to prove polynomiality of $f(n)$.

      At least one thing is true. Something in the vein of my construction will not work for $f$, since the “blocks” will have a lot of fine structure inside and you cannot glue them to one another so freely.

  22. Gil Kalai Post author

    We can maybe formulate intermidiate problems regarding the ith shaddows of our families, namely the i sets contained in sets in family i. Suppose that all sets are of size d. Abstracting the 1 shaddow is what Yury did and there we now know we cannot improve the upper bounds. We may try larger values of i between 1 and d.

  23. Pingback: Emmanuel Abbe: Erdal Arıkan’s Polar Codes | Combinatorics and more

  24. Pingback: Polynomial Hirsch Conjecture 5: Abstractions and Counterexamples. | Combinatorics and more

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s