## Open problem session of HUJI-COMBSEM: Problem #4, Eitan Bachmat: Weighted Statistics for Permutations

This is a continuation of our series of posts on the HUJI seminar 2020 open problems. This time the post was kindly written by Eitan Bachmat who proposed the problem.

My summary: understanding the distribution of longest increasing subsequences of random permutations, and the combinatorics of longest increasing subsequences of general permutations, is among the most fascinating stories in modern mathematics. Once you also add weights, this is largely uncharted territory.

## Heaviest increasing subsequences

Setup: Let X be a bounded distribution with support on an interval [1,T]. Given a permutation ${\sigma}$ of size N, an increasing subsequence is a set of indices ${i_1<i_2<\cdots<i_k}$ such that ${\sigma (i_1)<\sigma (i_2)<\cdots<\sigma (i_k)}$. If, in addition, we attach to each index ${i}$ a weight ${w(i)}$, the weight of the subsequence is the sum ${w(i_1)+w(i_2)+\cdots+w(i_k)}$. We are interested in the asymptotic behavior, as ${N\rightarrow \infty}$, of ${H_X(N)}$, the random variable given by the weight of the heaviest increasing subsequence, where ${\sigma}$ is chosen uniformly among permutations of size ${N}$ and the ${w_i}$ are sampled i.i.d. from X.

A standard sub-additivity argument shows that with high probability, as ${N\rightarrow \infty}$, ${\frac{H_X(N)}{\sqrt{N}}}$ approaches a constant ${\tau (X)}$. The theorem of Vershik and Kerov on the longest increasing subsequence in a random permutation states that if X is the Dirac distribution which always takes the value ${1}$, then ${\tau (X)=2}$. From this result we can conclude that for any ${X}$, ${\tau (X)\geq 2E(X)}$ (take the longest increasing subsequence, disregarding the weights). Using similar elementary arguments it can also be shown that

$\displaystyle \frac{\tau (X)}{2\sqrt{E(X^2)}},\ \frac{2\sqrt{E(X^2)}}{\tau (X)}\leq e\sqrt{\ln (T)}$

This means that ${2\sqrt{E(X^2)}}$ is a much better estimate of ${\tau (X)}$ than ${2E(X)}$: the ratio ${\frac{\sqrt{E(X^2)}}{E(X)}}$ can be as large as ${\sqrt{T}/2}$ for distributions with support in [1,T].
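As a quick numerical sanity check (my own sketch, with a toy two-point distribution of my choosing), one can estimate ${\tau (X)}$ by averaging ${H_X(N)/\sqrt{N}}$ over random instances, using a quadratic dynamic program for the heaviest increasing subsequence, and compare the estimate with ${2E(X)}$ and ${2\sqrt{E(X^2)}}$.

```python
import math
import random


def heaviest_increasing_subsequence(perm, w):
    """O(n^2) DP: best weight of an increasing subsequence ending at each i."""
    best = []
    for i in range(len(perm)):
        prev = max((best[j] for j in range(i) if perm[j] < perm[i]), default=0.0)
        best.append(w[i] + prev)
    return max(best, default=0.0)


def estimate_tau(sample_weight, n=200, trials=20, seed=0):
    """Monte Carlo estimate of tau(X) = lim H_X(N) / sqrt(N)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        w = [sample_weight(rng) for _ in range(n)]
        total += heaviest_increasing_subsequence(perm, w)
    return total / (trials * math.sqrt(n))


# Toy example: X uniform on {1, 3}, so E(X) = 2 and E(X^2) = 5.
tau_hat = estimate_tau(lambda rng: rng.choice([1.0, 3.0]))
```

One caveat: at moderate N the estimate is biased downward (already in the unweighted case the constant 2 is approached from below, with a correction of order ${N^{1/6}}$), so checking the conjectured bound numerically requires fairly large N.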

Problem 1: Is it true that for all X

$\displaystyle \tau (X)\geq 2\sqrt{E(X^2)}.$

Numerical experimentation with many distributions has so far failed to find a counterexample. Even the weaker bound

$\displaystyle \tau (X)\geq c\sqrt{E(X^2)}.$

for some constant ${c>0}$ is unknown to the best of our knowledge and would also be very nice to have.

We can conjecture a stronger statement by considering the space of all distributions (say on [1,T]) with its convex structure given by mixtures of distributions. We say that ${X=pY+(1-p)Z}$ is a mixture of ${Y}$ and ${Z}$ if, for all ${t}$, ${Pr(X\leq t)=pPr(Y\leq t)+(1-p)Pr(Z\leq t)}$.

Problem 2: Is it true that ${\tau ^2(X)}$ is a concave function with respect to mixtures, i.e., for any mixture ${X=pY+(1-p)Z}$ we have

$\displaystyle \tau^2(X)\geq p\tau^2(Y)+(1-p)\tau^2(Z).$

The first problem is essentially this statement for the mixture which expresses ${X}$ as a generalized mixture of the extreme points, which are Dirac measures on ${1\leq t\leq T}$. We don't have counterexamples for the stronger statement either.

The third problem concerns the upper bound.

Problem 3: Is there a constant ${C}$ such that ${\frac{\tau^2 (X)}{E(X^2)}\leq C}$ for all bounded ${X}$? We believe the answer is no. In particular, consider the family of Bounded Pareto distributions with parameter ${\alpha=2}$, with density ${f_T(t)=c_Tt^{-3}}$ for ${1\leq t\leq T}$, where ${c_T}$ is the normalizing constant. We believe that the ratio is unbounded over this particular family (and, in some moral sense, that this is the only case).

Problem 4: Compute ${\tau (X)}$ for ANY distribution which is not a Dirac distribution. The best chance seems to be the exponential or geometric distributions, for which Kurt Johansson solved the corresponding last passage percolation problem, https://arxiv.org/pdf/math/9903134.pdf

## Motivation

The problems above are motivated by the study of the asymptotic behavior of certain finite particle (interval) systems which we call “airplane boarding”. The system has a parameter ${k\geq 0}$. At any given moment the system consists of ${N}$ closed moving intervals ${I_1,\dots,I_N}$, all of length ${k}$, which we call “passengers” and which can only overlap at endpoints. The index of an interval/passenger is called its “queue location”. The ordering of the intervals by queue location never changes, i.e., passengers cannot pass each other while “boarding”. The interval ${[0,N]}$ is called the “airplane aisle”. Each interval ${I_j}$ has a target “row” (in the airplane aisle) ${\sigma (j)}$, where ${\sigma }$ is a permutation of ${1,...,N}$ (the system can be generalized to handle more than one “passenger” per “row”). To each “passenger” we also attach an “aisle clearing time” ${w_i}$.

Initially, at time ${t=0}$, all the intervals are to the left of ${-k}$ and in descending order of their index; for example, if ${k=1}$, we might have ${I_1=[-3,-2]}$, ${I_2=[-5.5,-4.5]}$, etc. At any given time, each interval/passenger ${i}$ moves at infinite speed towards its destination “row”, and if it is not blocked by other intervals to its right it positions itself with its right endpoint at ${\sigma (i)}$, i.e., it occupies ${[\sigma (i)-k,\sigma (i)]}$. When a passenger reaches its target “row”, it waits its aisle clearing time ${w_i}$ before leaving the system and allowing the passengers behind it to advance further towards their rows. The “boarding time” is the time at which all intervals have left the system and the process terminates. From this description one can check that the boarding time when ${k=0}$ is simply the weight of the heaviest increasing subsequence for ${\sigma}$ and weights ${w_i}$, so it is clear that the problems above are relevant to the analysis of “airplane boarding”.
To motivate them more specifically, let us introduce one more element, a “boarding policy”, which makes the boarding time into a random variable. The simplest policy, which we have already encountered, is Random boarding. In Random boarding we assume that the queue positions, the rows and the weights are all independent random variables. The first two are chosen uniformly, generating a uniformly random permutation, and we think of the ${w_i}$ as drawn i.i.d. from a distribution ${X}$, the aisle clearing time distribution of all passengers as a single group. We observe that ${\tau (X)}$ is a normalized asymptotic indicator of the boarding time of the Random policy. Suppose now that we have two (or, in general, more) subpopulations with different aisle clearing times: say, passengers with no hand luggage versus the other passengers, or passengers who need assistance or are travelling with children versus the other passengers as a second group. If we denote by ${Y}$ and ${Z}$ the aisle clearing times of the first and second subpopulations and by ${p}$ the proportion of passengers from the first subpopulation, then the aisle clearing time ${X}$ of the whole population will be the mixture ${pY+(1-p)Z}$. We use ${\tau}$ as a measure of the slowness or quickness of a group and will assume that the first group, with aisle clearing time distribution ${Y}$, is the group of “slow passengers”, i.e., ${\tau (Y)\geq \tau(Z)}$. Given this scenario, we can naturally define two policies which are practiced by airlines. The first is the “Slow First” (SF) policy, where we use a uniformly random permutation ${\sigma}$, but for the weights, those of the first ${[pN]}$ passengers are drawn i.i.d. from ${Y}$ and the others are drawn i.i.d. from ${Z}$. The other is “Fast First” (FF) which, as the name suggests, draws the weights of the first ${[(1-p)N]}$ passengers i.i.d. from ${Z}$ and the others from ${Y}$.
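The ${k=0}$ case of the three policies is easy to simulate, since there the boarding time is exactly the heaviest increasing subsequence. The toy sketch below is my own, with made-up Dirac aisle clearing times for the two groups; note that at ${k=0}$ a queue-and-row reversal symmetry suggests SF and FF are equal in distribution, so the genuine differences between them require ${k>0}$.

```python
import random


def heaviest_increasing_subsequence(perm, w):
    """O(n^2) DP; equals the boarding time in the degenerate case k = 0."""
    best = []
    for i in range(len(perm)):
        prev = max((best[j] for j in range(i) if perm[j] < perm[i]), default=0.0)
        best.append(w[i] + prev)
    return max(best, default=0.0)


def boarding_time_k0(policy, n=150, p=0.2, slow=3.0, fast=1.0, trials=20, seed=1):
    """Average k = 0 boarding time when a fraction p of passengers is slow.

    Rows form a uniformly random permutation under all three policies; the
    policy only decides where in the queue the slow weights are placed.
    """
    rng = random.Random(seed)
    n_slow = int(p * n)
    total = 0.0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        if policy == "SF":    # slow passengers at the front of the queue
            w = [slow] * n_slow + [fast] * (n - n_slow)
        elif policy == "FF":  # fast passengers at the front of the queue
            w = [fast] * (n - n_slow) + [slow] * n_slow
        else:                 # Random: slow positions scattered uniformly
            w = [slow] * n_slow + [fast] * (n - n_slow)
            rng.shuffle(w)
        total += heaviest_increasing_subsequence(perm, w)
    return total / trials
```

All parameter values here (n, p, the 3:1 weight ratio) are illustrative choices, not data from the papers cited in this post.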

Now we come to the test-your-intuition portion of the post. Spoiler alert: please test your intuition before continuing to read below the fold. Assuming “normal” coach class conditions in a full occupancy airplane, ${k=4}$, with a realistic proportion of slow passengers, say ${p=0.2}$, and a realistic effective aisle clearing time ratio ${\tau (Y)/\tau (Z)}$ of ${c=3}$, rank the three policies, Random, Slow First and Fast First, from fastest to slowest in terms of average asymptotic boarding time.

\ FOLD \

The ordering is: Slow First, then Fast First, then Random. OK, but what happens if we choose other parameters ${k,p,c}$? We have the following result of Erland, Kaupuzs, Frette, Pugatch and Bachmat, https://arxiv.org/abs/1906.05018

Theorem: (SF is better than FF universally) For any parameters ${k,p,c}$, SF has lower or equal average asymptotic boarding time compared to FF.

It turns out that we don't have a universal comparison between Random and FF. For realistic parameters FF is faster, but in some small region of parameter space Random is actually faster than FF. We are left with the comparison of SF and Random, and here Problem 2 comes into play. We have the following result of Erland, Kaupuzs, Steiner and Bachmat, https://arxiv.org/abs/2012.11246

Theorem: Slow First is universally (for all mixtures and all ${k}$) faster than Random if and only if ${\tau^2 (X)}$ is concave on the space of bounded distributions.

The concavity of ${\tau^2}$ has been verified on all mixtures coming from empirical observations of aisle clearing times (say, the first group consists of passengers with 0, 2 or 3 luggage items and the other of those with 1 or 4).

The third problem, and to some extent the first, have a somewhat different motivation. There is a very different situation where we break a distribution into a mixture, namely express line queues, where the population of customers is broken into fast customers, who go to the express line, and the others, who go to a regular line (assume there is one). We can consider the average waiting time in queue as a function of the mixture. Such systems have a parameter that does not exist in airplane boarding, the utilization ${\rho}$. On the other hand, airplane boarding has ${k}$, which does not have an express line counterpart. In addition, in airplane boarding there may be better policies for handling the slow and fast groups (say, slow passengers in back rows at the front of the queue and slow passengers from the front rows at the back of the queue, and more complicated ones).

It has been shown, https://www.sciencedirect.com/science/article/pii/S0377221718310695 , that after proper normalizations to offset these non-mutual parameters, for any mixture, the average response time of the express line queue is approximately the square of the boarding time of the optimal policy for handling the slow and fast groups of the mixture (a complicated policy; the result holds for neither Slow First nor Fast First). The quality of the approximation essentially depends on the ratio ${\tau^2 (X)/E(X^2)}$. If that ratio is universally bounded then so is the approximation ratio.

Having such a tight relation between express line queues and airplane boarding is useful, since we can use it to analyze airplane boarding via the simpler and better studied express line queues. In addition, express line queues have an exact and nontrivial duality on mixtures which preserves average waiting time, via a formula of Riemann that he used in his proof of the functional equation of the zeta function. The duality simplifies the analysis of express line queues on self-dual distributions; these include log-normal distributions, Bounded Pareto distributions (as a family) and distributions whose densities are given by squaring a weight 2 Hecke cusp form (as in the Taniyama-Shimura conjecture) restricted to the positive imaginary axis. The approximate relation allows us to get a nontrivial approximate duality in airplane boarding.

The post is becoming long, so I will leave you with one final insight which is at the heart of the analysis of airplane boarding. Asymptotically, airplane boarding is most naturally understood in terms of the evolution in (proper) time of particle clouds in space-time. As a process (consider the passengers who sit down at approximately the same time) it is essentially a discrete version of what is known as a congruence with respect to proper time of particles, starting from a Cauchy hypersurface. That’s the kind of stuff that Penrose studied (much more deeply) when he proved the singularity theorems that won him a share of the 2020 Nobel Prize in physics. Maximal length or heaviest increasing subsequences in uniformly random permutations correspond to the simplest case of geodesics in special relativity (straight lines). I will explain some other time what all this has to do with college admissions.

This entry was posted in Combinatorics, Guest blogger, Probability.

### 4 Responses to Open problem session of HUJI-COMBSEM: Problem #4, Eitan Bachmat: Weighted Statistics for Permutations

1. Gil Kalai says:

Hi Eitan, thanks again for the nice post. Two questions: a) What is the relation between the problem and black holes. b) There is a rich theory of longest monotone subsequences in general permutations and, for example, the Erdos-Szekeres lemma that when n=ab+1 we always have a monotone increasing subsequence of length a+1 or a monotone decreasing subsequence of length b+1. Are there weighted analogs for this lemma and the theory?

• eitan bachmat says:

Hi Gil
One way to think of increasing subsequences is as discrete trajectories in space-time, with maximal sequences corresponding to geodesics, and this interpretation holds also in the weighted case, with the weights acting as an index of refraction (conformal scaling factors). Erdos-Szekeres is special to two dimensions (1 space and 1 time) since in that case the negative of a Lorentzian metric is also Lorentzian, providing a duality. You can have a semi-weighted version by considering the total sum of weights divided by the length of the longest, say, decreasing sequence, and get a lower bound on the heaviest weight of an increasing subsequence; I actually used such types of estimates.
As noted above, the basic relation is to geodesics in space-time, not fundamentally to black holes, but black holes in the work of Penrose have been studied by following geodesics which are perpendicular to an “instantaneous slice” (Cauchy hypersurface), and the discrete analogue of this process in 2 dimensions is essentially airplane boarding. In the physics community, discrete versions of space-time are sometimes referred to as Causal Sets, and they have been studied since around 1978, with the leading figure being Rafael Sorkin of the Perimeter Institute. They have also been studied indirectly via growth processes such as polynuclear growth (PNG), which is part of the Kardar-Parisi-Zhang universality class. Physicists and mathematicians have studied these growth processes at great length, and the deep second order results come from this approach, with mathematically rigorous results mainly in 1+1 dimensions, where the algebraic structures and connections with integrable systems play a huge role. The relativistic point of view is more heuristic and comes from the role of curvature in the second fundamental form, but is easily seen in numerics. Sorry for being very vague; details would make the answer much longer, I am just trying to point to the relevant fields in math and physics.

• eitan bachmat says:

Hi Gil
One more comment: I highly recommend the booklet of Roger Penrose, “Techniques of Differential Topology in Relativity”, roughly 70 pages, which leads to the proof of the singularity theorems. I found it to be the most useful source for the mathematical analysis of airplane boarding, but the singularity theorems are obviously the real reason to read it.

2. Pascal says:

“What is the relation between the problem and black holes”: since you mention Erdős-Szekeres, one relation is Szekeres himself. I think this is the same person as in Kruskal-Szekeres. In general relativity, the Kruskal-Szekeres spacetime is a so-called “maximal extension” of the exterior Schwarzschild solution.