Peter Cameron: Doing research

To cheer you up in difficult times, here is a wonderful post by Peter Cameron about doing research. And see also this entertaining post by Peter on mathematics and religion.

Peter Cameron's Blog

Probably every research mathematician has been asked the question, “How do you do mathematical research?” Some lay people think we simply figure out ways of doing bigger and bigger long multiplications. Many more people think that all the mathematics must have been discovered by now, so what are we doing?

These misunderstandings are about what we do research on rather than how we do it, the question I want to focus on here.

In 1945, Jacques Hadamard published a remarkable book, The Psychology of Invention in the Mathematical Field. He was a great mathematician himself, one of whose achievements was the proof of the Prime Number Theorem fifty years earlier. He examined his own mental processes, searched the writings of his predecessors, and sent questionnaires to many leading mathematicians and scientists of his time, including Albert Einstein.

Hadamard’s conclusion was that the process of mathematical discovery can be broken…


To cheer you up in difficult times 18: Beautiful drawings by Neta Kalai for my book: “Gina Says”

In 2009 I wrote a book, “Gina Says”, that appeared here on the blog, about the adventures of “Gina” in the blogosphere. In 2017 the book (edited and shortened a little) was published by World Scientific. The most important additions were beautiful drawings contributed by my daughter Neta. Here are three of them. More to come.

“My own heart goes to chemistry”  (Chapter 5)

“Octopus music” (Chapter 25)

anti anti anti missile missile missile missile  (Chapter 17, see also the post Chomskian Linguistics)

 


Amazing: Simpler and more general proofs for the g-theorem by Stavros Argyrios Papadakis and Vasiliki Petrotou, and by Karim Adiprasito, Stavros Argyrios Papadakis, and Vasiliki Petrotou.

Stavros Argyrios Papadakis, Vasiliki Petrotou, and Karim Adiprasito

In 2018, I reported here about Karim Adiprasito’s proof of the g-conjecture for simplicial spheres.  This conjecture by McMullen from 1970 was considered a holy grail of algebraic combinatorics and it resisted attacks by many researchers over the years.
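Let me recall the statement in quick summary form (this is the standard formulation; see the earlier posts for precise background). For a simplicial (d-1)-sphere with f_{i-1} faces of dimension i-1 (and f_{-1}=1, counting the empty face), define

\displaystyle h_k=\sum_{i=0}^{k}(-1)^{k-i}\binom{d-i}{k-i}f_{i-1},\qquad g_0=1,\ \ g_k=h_k-h_{k-1}.

The conjecture asserts that, beyond the long-known Dehn-Sommerville relations h_k=h_{d-k}, the vector (g_0,g_1,\dots,g_{\lfloor d/2\rfloor}) is an M-sequence, namely the Hilbert function of a standard graded algebra. For boundaries of simplicial polytopes this is the g-theorem of Stanley (necessity) and Billera-Lee (sufficiency).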

Today’s post reports on two developments. The first, from December 2020, is a second proof of the g-conjecture for spheres by Stavros Argyrios Papadakis and Vasiliki Petrotou. (Here is a link to the arXiv paper.) A second proof of a major difficult theorem is always very very important and exciting. The proof seems considerably simpler than Karim Adiprasito’s proof. The miracle that enables the simplification is working in characteristic two, in contrast to Adiprasito’s proof, which is characteristic-free and therefore applies in greater generality.

However, the two teams joined forces, and in January 2021 posted a proof (which seems even simpler) by Karim Adiprasito, Stavros Argyrios Papadakis and Vasiliki Petrotou of the g-conjecture largely extended to pseudomanifolds. (Here is a link to the arXiv paper.)

I was too slow to report on these developments separately, so let me report here on both. I am thankful to Karim Adiprasito for telling me about both results and for some further conversations.

I hope that with these new proofs more people in the community will be able to absorb and understand the ideas, go carefully over the proofs, and apply the methods to a large number of outstanding problems beyond the g-conjecture. (A few of which are discussed in this post.)

December 2020: Stavros Argyrios Papadakis and Vasiliki Petrotou present a second proof of the g-conjecture.

A few weeks ago Stavros Argyrios Papadakis and Vasiliki Petrotou uploaded a paper to the arXiv, The characteristic 2 anisotropicity of simplicial spheres. The paper presents a second proof of McMullen’s g-conjecture for simplicial spheres.

Stavros Argyrios Papadakis and Vasiliki Petrotou: The characteristic 2 anisotropicity of simplicial spheres

Abstract: Assume D is a simplicial sphere, and k_1 is a field. We say that D is generically anisotropic over k_1 if, for a certain purely transcendental field extension k of k_1, a certain Artinian reduction A of the Stanley-Reisner ring k[D] has the following property: All nonzero homogeneous elements u of A of degree less or equal to (\dim D +1)/2 have nonzero square. We prove, using suitable differential operators, that, if the field k_1 has characteristic 2, then every simplicial sphere D is generically anisotropic over k_1. As an application, we give a second proof of a recent result of Adiprasito, known as McMullen’s g-conjecture for simplicial spheres. We also prove that the simplicial spheres of dimension 1 are generically anisotropic over any field k_1.

(GK) The g-conjecture is about relations between the face numbers of simplicial spheres. The new proof, like Karim’s proof and most earlier attacks, goes through an algebraic property, the Lefschetz property, and here this property is proved for fields of coefficients of characteristic two. The proof relies on a formula whose discovery was assisted by computer experimentation using the Grayson-Stillman computer algebra system Macaulay2.
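As a quick illustration of the face-number bookkeeping behind the statement (a minimal sketch of my own, not code from the paper), here is the standard passage from the f-vector to the h-vector and g-vector in Python:

```python
from math import comb

def h_vector(f, d):
    """h_k = sum_{i=0}^{k} (-1)^(k-i) * C(d-i, k-i) * f_{i-1}, with f_{-1} = 1."""
    ff = [1] + list(f)  # ff[i] = f_{i-1}; the empty face contributes f_{-1} = 1
    return [sum((-1) ** (k - i) * comb(d - i, k - i) * ff[i] for i in range(k + 1))
            for k in range(d + 1)]

d = 3
h = h_vector([6, 12, 8], d)  # boundary of the octahedron, a simplicial 2-sphere
g = [h[0]] + [h[k] - h[k - 1] for k in range(1, d // 2 + 1)]
print(h, g)  # [1, 3, 3, 1] [1, 2] -- h is symmetric, as Dehn-Sommerville predicts
```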

January 2021: Karim Adiprasito, Stavros Argyrios Papadakis, and Vasiliki Petrotou present a new proof for the g-theorem extended to general pseudomanifolds, cycles, and doubly Cohen-Macaulay complexes.

A few days ago Karim, Stavros Argyrios, and Vasiliki uploaded a new joint paper with yet a further simplification of the argument and far-reaching extensions to pseudomanifolds, cycles, and doubly Cohen Macaulay complexes.

Karim Adiprasito, Stavros Argyrios Papadakis and Vasiliki Petrotou, Anisotropy, biased pairings, and the Lefschetz property for pseudomanifolds and cycles

Abstract: We prove the hard Lefschetz property for pseudomanifolds and cycles in any characteristic with respect to an appropriate Artinian reduction. The proof is a combination of Adiprasito’s biased pairing theory and a generalization of a formula of Papadakis-Petrotou to arbitrary characteristic. In particular, we prove the Lefschetz theorem for doubly Cohen Macaulay complexes, solving a generalization of the g-conjecture due to Stanley. We also provide a simplified presentation of the characteristic 2 case of the Papadakis-Petrotou formula, and generalize it to pseudomanifolds and cycles.

A few more comments on both papers

A crucial ingredient in these works is the need to understand the behaviour of the Stanley-Reisner ring with respect to generic systems of parameters. Finding the right tools to use “genericity” is a crucial question. In his 2018 paper Adiprasito considered the Hall-Laman property, which is related to the connection with framework rigidity and the notion of generic rigidity. This property is crucial (in somewhat smaller generality) in the newest APP paper. It is also closely related to the generic anisotropicity property studied by Papadakis and Petrotou.

Exploring the use of genericity and the connections with rigidity in the study of face numbers goes back to works from the 80s and 90s by Stanley, Kalai, Björner and Kalai, Billera, Whiteley, Lee, Tay and Whiteley, and others. The reference to Hall is based on Hall’s famous marriage theorem (which has some roots also in Jacobi’s work) and its connections with generic determinants that were considered in the works of Tutte, Edmonds, Lovász, Rabin and Vazirani, Karp, Upfal, and Wigderson, and others. The reference to Laman refers to Laman’s 1970 theorem that gives a characterization of planar generic rigidity (we discussed it briefly in this post on derandomization).

Another crucial idea, both in Adiprasito’s 2018 proof and in the Papadakis-Petrotou proof, is the nondegeneracy (in a strong sense) of a Poincaré pairing. Both proofs use the property that the Poincaré pairing does not degenerate at ideals (something Karim calls the biased pairing property) as well as infinitesimal deformations of the linear system of parameters. Karim told me that the two proofs face distinct difficulties: his proof is geometrically difficult, while the Papadakis-Petrotou proof is restricted by the characteristic of the field. By combining forces, they managed to circumvent both difficulties!

While the Lefschetz property is extended to arbitrary characteristics and to very general classes of simplicial complexes, the generic anisotropicity property itself is still open beyond characteristic two. The new paper also offers a new understanding of the formula that was originally found by Stavros and Vasiliki with computer experimentation, and relates it to old results by Carl Lee.

I am quite happy that the new proof extends to the exciting uncharted territory of pseudomanifolds, as there is a lot we want to understand regarding their Stanley-Reisner rings. For example, it is a long dream of mine to be able to identify Goresky-MacPherson’s intersection cohomology of a simplicial pseudomanifold K (for arbitrary perversities) through the Stanley-Reisner ring.

For earlier posts on the g-conjecture click here. See especially the guest posts by Eran Nevo.

Hilda Geiringer

An interesting historical comment is that Laman’s 1970 theorem about generic planar rigidity was proved already in 1927 by the famous applied mathematician and statistician Hilda Pollaczek-Geiringer. Here is an English exposition of her proof and some of her other works on rigidity by Brigitte Servatius.


Igor Pak: What if they are all wrong?

To cheer you up in difficult times, here is a wonderful, thoughtful, entertaining, and provocative post by Igor Pak about conjectures. See also Shmuel Weinberger’s take on conjectures.

Igor Pak's blog

Conjectures are a staple of mathematics. They are everywhere, permeating every area, subarea and subsubarea. They are diverse enough to avoid a single general adjective. They come in all shapes and sizes. Some of them are famous, classical, general, important, inspirational, far-reaching, audacious, exciting or popular, while others are speculative, narrow, technical, imprecise, far-fetched, misleading or recreational. That’s a lot of beliefs about unproven claims, yet we persist in dispensing them, inadvertently revealing our experience, intuition and biases.

The conjectures also vary in attitude. Like a finish line ribbon they all appear equally vulnerable to an outsider, but in fact differ widely from race to race. Some are eminently reachable, the only question being who will get there first (think 100 meter dash). Others are barely on the horizon, requiring both great effort, variety of tools, and an extended time commitment (think ironman triathlon). The most celebrated third…


To cheer you up in difficult times 17: Amazing! The Erdős-Faber-Lovász conjecture (for large n) was proved by Dong Yeap Kang, Tom Kelly, Daniela Kühn, Abhishek Methuku, and Deryk Osthus!

Dong Yeap Kang, Tom Kelly, Daniela Kühn, Abhishek Methuku, and Deryk Osthus have just uploaded a paper to the arXiv, A proof of the Erdős-Faber-Lovász conjecture. (I am thankful to Nati Linial and Ryan Alweiss for telling me about it.)

Here is one formulation of the conjecture:

Erdős-Faber-Lovász Conjecture: Let F be a family of k-subsets of \{1,2,\dots,n\}. Suppose that every two sets in F have at most one element in common. Then F can be divided into at most n families so that every two sets in each of these families are disjoint.

Using some standard hypergraph jargon the conjecture can be stated as follows:

Erdős-Faber-Lovász Conjecture: The chromatic index of any linear hypergraph on n vertices is at most n.

Here a linear hypergraph is a hypergraph where every two edges share at most one vertex. The chromatic index of a hypergraph is the number of colors needed to color the edges so that  every two intersecting edges have distinct colors.
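To make the definitions concrete, here is a minimal Python sketch (an illustration added here, not code from the paper): it checks linearity and greedily edge-colors the Fano plane, a tight example where all seven lines pairwise intersect, so the chromatic index equals n = 7.

```python
from itertools import combinations

def is_linear(edges):
    """Every two edges share at most one vertex."""
    return all(len(e & f) <= 1 for e, f in combinations(edges, 2))

def greedy_chromatic_index(edges):
    """Greedily color edges so that intersecting edges get distinct colors."""
    color = {}
    for e in edges:
        used = {color[f] for f in color if e & f}
        color[e] = next(c for c in range(len(edges)) if c not in used)
    return 1 + max(color.values())

fano = [frozenset(e) for e in
        [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]]
assert is_linear(fano)
print(greedy_chromatic_index(fano))  # 7 = n: every two Fano lines meet
```

Greedy coloring alone is of course far from the theorem; it only certifies an upper bound for a given hypergraph.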

The new result asserts

Theorem  (Kang, Kelly, Kühn, Methuku, and Osthus): For every sufficiently large n, every linear hypergraph H on n vertices has chromatic index at most n.

A stability version of this result, which confirms a prediction of Jeff Kahn, is also provided. Of course, it will be interesting to settle the conjecture for all values of n.

This is an amazing breakthrough! Congratulations!

(from Wikipedia)

 

Daniela Kühn, Deryk Osthus, Tom Kelly, and Abhishek Methuku (I will add a picture of Dong Yeap Kang when I find one. Update: found! See below.)

A little more

Background and links: Here is the Wikipedia article. We mentioned the conjecture in this post and this one.

Matching and fractional version. In 1982 Paul Seymour proved that any linear hypergraph H has a matching of size at least e(H)/n. In 1992 Jeff Kahn and Paul Seymour proved that the fractional chromatic index of any linear hypergraph on n vertices is at most n.
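For completeness, recall that the fractional chromatic index is the optimum of the natural linear programming relaxation over the set \mathcal{M} of matchings of H (collections of pairwise disjoint edges):

\displaystyle \chi'_f(H)=\min\Big\{\sum_{M\in\mathcal{M}}x_M\ :\ \sum_{M\ni e}x_M\ge 1\ \text{for every edge}\ e,\ \ x_M\ge 0\Big\}.

So the Kahn-Seymour theorem says that this LP optimum is at most n for every linear hypergraph on n vertices.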

Approximate version via the semi-random method. In 1992 Jeff Kahn proved that the chromatic index of any linear hypergraph on n vertices is at most n(1+o(1)). Kahn used the Rödl nibble (aka semi-random) method, and later extended the result to list-coloring.

Open problems

There are many related open questions, and some are mentioned in the paper. It will be interesting to prove the EFL conjecture for all values of n. The EFL conjecture extends to a beautiful generalization of Vizing’s theorem, which is still open. The EFL conjecture for list coloring is also still open. See also this 1994 article by Jeff Kahn on hypergraph matching, covering, and coloring problems.

Another interesting problem is:

The restricted Larman’s conjecture: Let F be an intersecting family of k-subsets of \{1,2,\dots,n\}.  Then F can be divided into at most n 2-intersecting families, namely so that every two sets in each of these families have at least 2 elements in common.

Little update: a few more details on the last problem. As far as I know, for the restricted Larman conjecture it is not even known that F can always be divided into (1+o(1))n 2-intersecting families.

Now, call G strongly 2-intersecting if the intersection of all sets in G has at least two elements. Füredi and Seymour proved that F can be fractionally covered by n strongly 2-intersecting subfamilies. They conjectured that F can always be covered by n strongly 2-intersecting families; however, Jeff Kahn and I disproved this conjecture.


Open problem session of HUJI-COMBSEM: Problem #5, Gil Kalai – the 3ᵈ problem

This post continues to describe problems presented at our open problems session back in November 2020. Here is the first post in the series.  Today’s problem was presented by me, and it was an old 1989 conjecture of mine.

A convex set K is centrally symmetric if x \in K implies that -x \in K.

Conjecture: Let P be a centrally symmetric d-polytope. Then P has at least 3ᵈ non-empty faces.

Here we count P itself as a face.

Before we continue let me mention a winter school on “The Interplay between High-Dimensional Geometry and Probability”, January 11-15. Among other exciting lectures Bo’az Klartag will give 3 lectures on Yuansi Chen’s work on the KLS conjecture, that we discussed in this post.

Now back to the 3ᵈ conjecture. A case of equality is the d-dimensional cube C_d. The faces of C_d correspond to (-1,1,*)-vectors of length d: such a vector corresponds to the face of the cube whose vertices are obtained by replacing the *’s with ±1 in all possible ways. Of course, the dual of C_d also has 3^d non-empty faces.
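Here is a tiny Python check of this correspondence (an illustration added here): a face with j coordinates equal to * is a j-dimensional face, and the counts sum to 3^d.

```python
from itertools import product
from math import comb

def cube_faces(d):
    """Nonempty faces of the d-cube <-> vectors in {-1, +1, '*'}^d."""
    return list(product((-1, +1, '*'), repeat=d))

for d in range(1, 7):
    faces = cube_faces(d)
    # a face with j '*'-coordinates is j-dimensional; there are C(d, j) * 2^(d - j)
    f = [sum(1 for v in faces if v.count('*') == j) for j in range(d + 1)]
    assert f == [comb(d, j) * 2 ** (d - j) for j in range(d + 1)]
    assert sum(f) == 3 ** d  # all nonempty faces, the cube itself included
```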

 

Schlegel diagram of the octahedral prism, a 4-dimensional Hanner polytope. (Image created with Robert Webb’s Stella software, http://www.software3d.com/Stella.php; via Wikipedia.)

Some comments

Here are some points raised in the discussion and further comments.

  1. There are many cases of equality – all Hanner polytopes. Those are obtained from intervals by taking the two operations, a) Cartesian product and b) moving from a polytope P to its polar dual P*, and applying them in all possible ways. (A small sketch verifying the face count for the polytope in the picture appears after this list.)
  2. Nati asked if a (2+\epsilon)^d lower bound is known. To the best of my knowledge even this is not known.
  3. Gideon asked about the connections with Mahler’s conjecture that for every d-dimensional centrally symmetric body K, vol(K)vol(K*) \ge 4^d/d!. Indeed the cases of equality are conjectured to be the same – the class of Hanner polytopes. (Mahler’s conjecture and its relations to combinatorics would deserve a separate post.) The recent paper Flag numbers and floating bodies by F. Besau, C. Schütt, and E. M. Werner looks very relevant.
  4. For simplicial polytopes the conjecture follows from results of Stanley (1986) and is related to a result by Bárány and Lovász (1982).
  5. The conjecture is related to a 1979 theorem of Figiel, Lindenstrauss, and Milman that asserts that for every centrally symmetric d polytope \log f_0(P) \log f_{d-1}(P) \ge \gamma d, for some universal constant \gamma.
  6. A somewhat more general conjecture would be that the number of k-faces of the barycentric subdivision of K is always as large as the number of k-faces of the barycentric subdivision of C_d. The case k=0 is the original conjecture, and I regard the case k=d-1 (sometimes referred to as the “flag-number conjecture”) as even closer to Mahler’s conjecture. See the paper by Schmitt and Ziegler, Ten Problems in Geometry.
  7. In this paper by Raman Sanyal, Axel Werner, and Günter M. Ziegler, the conjecture is proved for d \le 4. The proof is rather involved and relies on an extension of the lower bound theorem. (It does not extend to polyhedral spheres.) The paper also gives counterexamples to two much stronger conjectures that I made in my 1989 paper.
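As promised in item 1, here is a small sketch (my own illustration) of why every Hanner polytope attains the conjectured bound with equality. It uses two standard facts: the nonempty faces of P \times Q are exactly the products F \times G of nonempty faces, and polarity preserves the number of nonempty faces.

```python
def faces(expr):
    """Count nonempty faces of a Hanner polytope given as a nested expression:
    'I' is an interval, ('x', A, B) a Cartesian product, ('*', A) a polar dual."""
    if expr == 'I':
        return 3                                # two endpoints + the interval itself
    if expr[0] == 'x':
        return faces(expr[1]) * faces(expr[2])  # faces of a product are products F x G
    return faces(expr[1])                       # polarity preserves the face count

cube3 = ('x', 'I', ('x', 'I', 'I'))             # the 3-cube as a triple product
octahedral_prism = ('x', ('*', cube3), 'I')     # the polytope in the picture above
print(faces(octahedral_prism))                  # 81 = 3^4
```

By induction, any expression built from intervals by products and duals in dimension d evaluates to 3^d.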

Some further comments

Continue reading


To cheer you up in difficult times 16: Optimism, two quotes

A timely quote

“If there’s so much shit around, there has to be a pony somewhere.”

Arundhati Roy, The God of Small Things

and

Give me a couple of years free from other duties, and I shall complete the task – the task of copying the whole book of nature, and of writing the new science

Francis Bacon, according to Karl Popper

 

Francis Bacon drawn by Neta Kalai

The hard choice between family, teaching and committee work on the one hand, and writing the whole book of nature on the other

Blogged again (original 2008, and Chapter 23 in “Gina Says“), to cheer you up in difficult times.

There has to be a pony somewhere

From Arundhati Roy’s book “The God of Small Things”

Here is a tale from Arundhati Roy‘s book “The God of Small Things”.  In the book Margaret Kochamma tells the following joke to Chacko in Oxford, England, where the two meet: A man had twin sons… Pete and Stuart. Pete was an Optimist and Stuart was a Pessimist… On their thirteenth birthday their father gave Stuart – the Pessimist – an expensive watch, a carpentry set and a bicycle… And Pete’s – the Optimist’s – room he filled with horse dung… When Stuart opened his present he grumbled all morning. Continue reading


The Argument Against Quantum Computers – A Very Short Introduction

Left: Gowers’s book Mathematics: A Very Short Introduction. Right: C. elegans; Boson Sampling can be seen as the C. elegans of quantum computing. (See this paper.)

Update (January 6, 2021): Tomorrow, January 7, at 8:30 AM Israel time, I give a lecture on this topic at the CQT (Centre for Quantum Technologies) Thirteenth Anniversary meeting in Singapore. The entire program looks exciting. Registration is open and free. Here are the slides.

Happy new year to all my readers.

Today, I will briefly explain the main argument in my theory regarding quantum computers. In this post I will use the term HQCA (“Huge Quantum Computational Advantage”) instead of “quantum supremacy”, which Google uses, and “quantum advantage”, which IBM and the Jiuzhang Boson Sampling group use. Before we move to the main topic of the post, here are two related updates.

Boson Sampling update

A group of researchers mainly from USTC in Hefei, China, claimed to achieve a fantastic quantum computational advantage using Jiuzhang, a photonic device for “Boson Sampling”. It now seems plausible that the level-k approximation from my paper Gaussian Noise Sensitivity and BosonSampling with Guy Kindler will “spoof” the claims, namely, will show an efficient way to produce samples of similar fidelity on a digital computer. (There are other related approaches for spoofing the claim, mainly one by Jelmer Renema, and there are interesting connections between Jelmer’s approximations and ours. Sergio Boixo, John Martinis, and other researchers also raised concerns regarding the new HQCA claims.) While it will probably take several weeks or months to clarify this further, it is already clear, in view of my 2014 results with Kindler, that the claims of achieving in 200 seconds samples that would require millions of computing years on current supercomputers are unfounded.

For more details  see my post about the recent HQCA Boson Sampling experiment, and also Scott Aaronson’s recent post.

Scott Aaronson’s apology

Let me mention that Scott Aaronson’s recent post issued the following apology addressed to me, which I gladly welcome:

The argument against quantum computers: a very short introduction

Our starting point is the following recent assertion by Scott Aaronson:

Many colleagues told me they thought experimental BosonSampling was a dead end, because of photon losses and the staggering difficulty of synchronizing 50-100 single-photon sources. They said that a convincing demonstration of quantum supremacy would have to await the arrival of quantum fault-tolerance—or at any rate, some hardware platform more robust than photonics. I always agreed that they might be right.

My argument for why all attempts to reach HQCA as well as quantum error-correction via NISQ systems will fail  is based on two claims.  

A) “A convincing demonstration of HQCA would have to await the arrival of quantum fault-tolerance”

B) The error rate required for quantum error correcting codes needed for quantum fault tolerance is even lower than that required for HQCA.

If claims A) and B) are both correct, then NISQ systems can reach neither HQCA nor good-quality quantum error-correction; all attempts to do so will fail.

As Scott Aaronson himself asserted, many people in the quantum computing community agree (or used to agree) with claim A). And there is also wide agreement and experimental evidence for claim B). But fewer people see the logical consequence of claims A) and B) put together. 

A little more

The basis for Claim A

We need to put claim A on theoretical grounds. And indeed it is based on an assertion from computational complexity about the computational power of NISQ systems, PLUS a heuristic principle (“naturalness”) on the relation between asymptotic statements and the behaviour of intermediate-scale systems.

My work with Guy Kindler shows that for a constant level of noise, NISQ systems represent a very low-level computational power that we call LDP. (We studied BosonSampling specifically, but the argument extends.)

The naturalness principle is quite common when relating theory to practice: e.g., when Scott Aaronson and Sam Gunn argue that “spoofing” the Google statistical test for 53 qubits is probably very hard for digital computers, they make a similar leap from asymptotic insights to hardness insights for intermediate-scale systems.

Non-stationary and chaotic behaviour

There are several facts that strengthen the argument for claim A. Let me mention one: we already noted that for a constant level of noise, NISQ systems represent very low-level computational power. Our work also asserts that for a wide range of lower noise levels, the empirical distribution from sampling based on NISQ systems will be very noise-sensitive: we can expect non-stationary and even chaotic empirical outcomes. This is a great thing to test empirically!

But how is it that classical computation is possible?

This is a beautiful piece of the puzzle. The very low computational complexity class describing the power of NISQ systems does support a very rudimentary form of classical error-correction. This is the reason that robust classical information is possible, and it is why I can write this post for you to read. (See also this cartoon post from 2017.)
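To illustrate what I mean by a rudimentary form of classical error-correction, here is a toy sketch (my illustration): the repetition code with majority decoding, whose decoding requires only very low-level computation.

```python
import random

def repetition_failure_rate(bit=1, copies=101, flip_p=0.2, trials=10_000):
    """Encode `bit` by repetition, flip each copy independently with
    probability flip_p, decode by majority vote; return the failure rate."""
    failures = 0
    for _ in range(trials):
        received = [bit ^ (random.random() < flip_p) for _ in range(copies)]
        decoded = int(sum(received) > copies / 2)
        failures += decoded != bit
    return failures / trials

print(repetition_failure_rate())  # ~0.0: failures decay exponentially in `copies`
```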

Topological quantum computing (4 Nick)

Topological quantum computing is a proposed alternative path to very stable quantum qubits and quantum computing. Accepting claims A) and B) does not directly imply that topological quantum computing will fail, but there are good reasons that the argument does extend.

Was HQCA achieved already?

There are two papers claiming it was! One, describing the Sycamore Random Circuit Sampling experiment, was published in Nature in 2019, and one, describing the Jiuzhang Boson Sampling experiment, was published in Science in 2020. I find both claims unconvincing.


An Open Problem

One question that certainly comes to mind is whether the technological advances represented by Jiuzhang, and earlier by Sycamore and by other superconducting and trapped-ion NISQ systems, could have technological and scientific fruits even without achieving a computational advantage.

This  is especially interesting if my theory is correct, but it is also interesting if it is not.

More information

A few of my papers and lectures

The argument against quantum computers, Quantum, Probability, Logic: Itamar Pitowsky’s Work and Influence (M. Hemmo and O. Shenker, eds.), pp. 399–422, Springer, 2019. (Section 3.5 refers to topological quantum computing.)

Three puzzles on mathematics, computation, and games, Proc. ICM 2018;

The quantum computer puzzle, Notices AMS, May 2016

Boson Sampling was studied in a 2014 paper by Guy Kindler and me, Gaussian Noise Sensitivity and BosonSampling. Our study of Boson Sampling gives the basis for understanding general NISQ systems.

The argument against quantum computers, the quantum laws of nature, and Google’s supremacy claims, The Intercontinental Academia Laws: Rigidity and Dynamics (M. J. Hannon and E. Z. Rabinovici, eds.), World Scientific, 2020. arXiv:2008.05188.

Slides from my 2019 CERN lecture.

Videotaped lectures

A recent colloquium at Harvard

My ICM 2018 videotaped lecture;  A very easy-going videotaped zoom lecture at USTLC.  (slides)

Earlier posts: My collegial supremacy FAQ; Quantum matters.

Boson Sampling as the C. elegans of quantum computing

Caenorhabditis elegans, or C. elegans for short, is a species of free-living roundworm whose biological study has shed much light on more complicated organisms. In my paper in Itamar Pitowsky’s memorial volume I regard BosonSampling as the C. elegans of quantum computing: several technical and conceptual aspects of quantum computation are becoming much clearer for this model.

A remark about the terminology

John Preskill’s term “quantum supremacy” is problematic because the word “supremacy” (e.g., “the supremacy of the king”) often refers to superiority or dominance across the board, which is not the case for quantum computation. It is also unfortunate in terms of its connotations. I agree with Preskill that “advantage” is too weak: e.g., when it is claimed that the new Boson Sampling experiment can achieve in 200 seconds what would take an ordinary supercomputer millions of years, this is not merely an “advantage.” This is why I propose “huge quantum computational advantage” (HQCA).

 

Robert Aumann, Benjamin Netanyahu, and quantum computers

Here is an amusing story (in Hebrew) from an interview of Robert Aumann about quantum computers and Benjamin Netanyahu.



 


Open problem session of HUJI-COMBSEM: Problem #4, Eitan Bachmat: Weighted Statistics for Permutations

This is a continuation of our series of posts on the HUJI seminar 2020 open problems. This time the post was kindly written by Eitan Bachmat who proposed the problem. 

My summary: understanding the distribution of longest increasing subsequences of random permutations, and the combinatorics of longest increasing subsequences of general permutations, is among the most fascinating stories in modern mathematics. When you also add weights, this is largely uncharted territory.

Heaviest increasing subsequences

Setup: Let X be a bounded distribution with support on an interval [1,T]. Given a permutation {\sigma} of size N, an increasing subsequence is a set of indices {i_1<i_2<\cdots<i_k} such that {\sigma (i_1)<\sigma (i_2)<\cdots<\sigma (i_k)}. If in addition we attach to each index {i} a weight {w(i)}, the weight of the subsequence is the sum {w(i_1)+w(i_2)+\cdots+w(i_k)}. We are interested in the asymptotic behavior as {N\rightarrow \infty} of {H_X(N)}, the random variable given by the weight of the heaviest increasing subsequence, where {\sigma} is chosen uniformly among permutations of size {N} and the {w_i} are sampled i.i.d. from X.

A standard sub-additivity argument shows that with high probability, as {N\rightarrow \infty}, {\frac{H_X(N)}{\sqrt{N}}} approaches a constant {\tau (X)}. The theorem of Vershik and Kerov on the longest increasing subsequence in a random permutation states that if X is the Dirac distribution which always takes the value {1}, then {\tau (X)=2}. From this result, we can conclude that for any {X}, {\tau (X)\geq 2E(X)} (Take the longest increasing subsequence, disregarding the weights). Using similar elementary arguments it can be shown also that

\displaystyle \frac{\tau (X)}{2\sqrt{E(X^2)}},\ \frac{2\sqrt{E(X^2)}}{\tau (X)}\ \leq\ e\sqrt{\ln T},

which means that {2\sqrt{E(X^2)}} is a much better estimate of {\tau (X)} than {2E(X)} (the ratio {\frac{\sqrt{E(X^2)}}{E(X)}} can be as large as {\sqrt{T}/2} for distributions with support in [1,T]).
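These quantities are easy to experiment with numerically. Here is a minimal Monte Carlo sketch, added for illustration with arbitrary small parameters (it is not code from the original discussion): an O(N^2) dynamic program computes the heaviest increasing subsequence, and averaging {H_X(N)/\sqrt{N}} estimates {\tau (X)}.

```python
import random

def heaviest_increasing(perm, w):
    """O(N^2) DP: best[i] = weight of the heaviest increasing
    subsequence ending at index i."""
    best = []
    for i, p in enumerate(perm):
        prev = max((best[j] for j in range(i) if perm[j] < p), default=0.0)
        best.append(prev + w[i])
    return max(best)

def estimate_tau(sample_weight, N=1000, trials=10):
    """Average H_X(N) / sqrt(N) over uniformly random permutations."""
    total = 0.0
    for _ in range(trials):
        perm = random.sample(range(N), N)  # a uniformly random permutation
        w = [sample_weight() for _ in range(N)]
        total += heaviest_increasing(perm, w) / N ** 0.5
    return total / trials

print(estimate_tau(lambda: 1.0))  # Dirac weights: Vershik-Kerov predicts ~2
```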

Problem 1: Is it true that for all X

\displaystyle \tau (X)\geq 2\sqrt{E(X^2)}.

Numerical experimentation with many distributions has so far failed to find a counterexample. Even the weaker bound

\displaystyle \tau (X)\geq c\sqrt{E(X^2)}.

for some constant {c>0} is unknown to the best of our knowledge and would also be very nice to have.

We can conjecture a stronger statement, by considering the space of all distributions (say on [1,T]) with its convex structure given by mixtures of distributions. We say that {X=pY+(1-p)Z} is a mixture of {Y} and {Z}, if for all {t}, {Pr(X<t)=pPr(Y<t)+(1-p)Pr(Z<t)}.

Problem 2: Is it true that {\tau ^2(X)} is a concave function with respect to mixtures, i.e., for any mixture {X=pY+(1-p)Z} we have

\displaystyle \tau^2(X)\geq p\tau^2(Y)+(1-p)\tau^2(Z).

The first problem is essentially this statement for the mixture which expresses {X} as a generalized mixture of the extreme points, which are Dirac measures on {1\leq t\leq T}. We don’t have counterexamples for the stronger statement either.

The third problem concerns the upper bound.

Problem 3: Is there a constant {C} such that {\frac{\tau^2 (X)}{E(X^2)}\leq C} for all bounded {X}? We believe this is false. In particular, consider the family of bounded Pareto distributions with parameter {\alpha=2}, with density given by {f_T(t)=c_Tt^{-3}} for {1\leq t\leq T} and {c_T} the normalizing constant. We believe that the ratio for this particular family is unbounded (and in some moral sense, that’s the only case).
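For readers who want to try this family in the estimator sketched above, inverting the CDF of the stated density gives a one-line sampler (an illustration; the CDF formula follows by integrating {f_T}):

```python
import random

def bounded_pareto(T):
    """Sample from f_T(t) = c_T * t**-3 on [1, T] by inverting the CDF
    F(t) = (1 - t**-2) / (1 - T**-2)."""
    u = random.random()
    return (1.0 - u * (1.0 - T ** -2)) ** -0.5

# e.g. estimate_tau(lambda: bounded_pareto(100.0)), with T growing, probes Problem 3
```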

Problem 4: Compute {\tau (X)} for ANY distribution which is not a Dirac distribution. The best chance seems to be the exponential or geometric distributions, for which Kurt Johansson answered the corresponding last passage percolation problem: https://arxiv.org/pdf/math/9903134.pdf

Motivation

The problems above are motivated by the study of the asymptotic behavior of certain finite particle (interval) systems which we call “airplane boarding”. The system has a parameter {k\geq 0}. At any given moment, the system consists of {N} closed moving intervals {I_1,\dots,I_N}, all of length {k}, which we call “passengers” and which can only overlap at endpoints. The index of an interval/passenger is called its “queue location”. The ordering of the intervals according to queue locations never changes, i.e., passengers cannot pass each other while “boarding”. The interval {[0,N]} is called the “airplane aisle”. Each interval {I_j} has a target “row” (in the airplane aisle) {\sigma (j)}, where {\sigma} is a permutation on {1,...,N} (the system can be generalized to handle more than one “passenger” per “row”). To each “passenger” we also attach an “aisle clearing time” {w_i}.

Initially, at time {t=0}, all the intervals are to the left of {-k} and in descending order of their index; for example, if {k=1}, we might have {I_1=[-3,-2]}, {I_2=[-5.5,-4.5]}, etc. At any given time point, each interval/passenger {i} moves at infinite speed towards its destination “row”, and if it is not blocked by other intervals to its right it positions itself with its right endpoint at {\sigma (i)}, i.e., it occupies {[\sigma_i-k,\sigma_i]}. When a passenger reaches its target “row”, it waits its aisle clearing time {w_i} before leaving the system and allowing the passengers behind to advance further towards their rows. The “boarding time” is the time when all intervals have left the system and the process terminates. From this description, one can check that the boarding time when {k=0} is simply the heaviest increasing subsequence for {\sigma} and weights {w_i}, so it is clear that the problems above are relevant to the analysis of “airplane boarding”.

To motivate them more specifically, let us introduce one more element, a “boarding policy”, which makes the boarding time into a random variable. The simplest policy, which we have already encountered, is Random boarding. In Random boarding we assume that the queue position, the row, and the weights are all independent random variables. The first two are chosen uniformly, generating a uniformly random permutation, and we think of the {w_i} as drawn i.i.d. from a distribution {X}, which is the aisle clearing time distribution of all passengers as a single group. We observe that {\tau (X)} is a normalized asymptotic indicator of the boarding time of the Random policy.

Suppose now that we have two (or, in general, more) subpopulations with different aisle clearing times, say, passengers with no hand luggage and the other passengers, or passengers who need assistance or are travelling with children and the other passengers as a second group. If we denote by Y and Z the aisle clearing times of the first and second subpopulations and by p the portion of passengers from the first subpopulation, then the aisle clearing time of the whole population {X} will be the mixture {pY+(1-p)Z}. We use {\tau} as a measure of the slowness or quickness of a group and will assume that the first group, with aisle clearing time distribution {Y}, is the group of “slow passengers”, i.e., {\tau (Y)\geq \tau(Z)}. Given this scenario, we can naturally define two policies which are practiced by airlines. The first policy is “Slow First” (SF), where we use a uniformly random permutation {\sigma}, but for the weights, those of the first {[pN]} passengers are drawn i.i.d. from Y and the others are drawn i.i.d. from Z. The other policy is “Fast First” (FF) which, as the name suggests, draws the weights of the first {[(1-p)N]} passengers i.i.d. from Z and the others from Y.

Now we come to the test-your-intuition portion of the post. Spoiler alert: please test your intuition before continuing to read below the fold. Assuming “normal” coach-class conditions in a full-occupancy airplane, {k=4}, with a realistic portion of slow passengers, say {p=0.2}, and a realistic effective aisle clearing time ratio {\tau (Y)/\tau (Z)} of {C=3}, rank the 3 policies, Random, Slow First and Fast First, from fastest to slowest in terms of average asymptotic boarding time (ascending boarding time order).


Continue reading


To Cheer You Up in Difficult Times 15: Yuansi Chen Achieved a Major Breakthrough on Bourgain’s Slicing Problem and the Kannan, Lovász and Simonovits Conjecture

Yuansi Chen

This post gives some background to  a recent amazing breakthrough  paper: An Almost Constant Lower Bound of the Isoperimetric Coefficient in the KLS Conjecture by Yuansi Chen. Congratulations Yuansi!

The news

Yuansi Chen gave almost-constant bounds both for Bourgain’s 1984 slicing problem and for the 1995 Kannan-Lovász-Simonovits conjecture. In this post I will describe these conjectures.

Unrelated cheerful news: Here is a very nice Quanta article by Kevin Hartnett on Lauren Williams, the positive Grassmannian and connections to physics. (See this related 2015 post.)

Bourgain’s slicing problem

Bourgain’s slicing problem (1984):  Is there c > 0 such that for any dimension n and any centrally symmetric convex body K ⊆ \mathbb R^n of volume one, there exists a hyperplane H such that the (n − 1)-dimensional volume of K ∩ H is at least c?

Vitali Milman wrote: “I was told once by Jean that he had spent more time on this problem and had dedicated more efforts to it than to any other problem he had ever worked on.”  

A positive answer to the problem is sometimes referred to as the hyperplane conjecture. Bourgain himself proved in the late 1980s that the answer is yes if one allows c = 1/(n^{1/4}\log n), and twenty years later Boaz Klartag shaved away the \log n factor. For more on the problem see this article by Boaz Klartag and Vitali Milman, and for the history see the moving foreword by Vitali Milman.

Yuansi Chen’s result (combined with an earlier result of Eldan and Klartag) asserts that c can be taken as n^{-o(1)}.

Computing the volume of convex sets and the 1995 Kannan-Lovász-Simonovits conjecture

Kannan, Lovász and Simonovits (KLS) conjectured in 1995 that for any log-concave distribution, the Cheeger isoperimetric coefficient equals, up to a universal constant factor, the coefficient achieved by half-spaces. The crucial point here is that the constant does not depend on the dimension.

Let K be a convex body of volume one in \mathbb R^n and consider the uniform probability distribution on K. Given a subset A of K of volume t≤1/2, let S(A) denote the surface area of the part of the boundary of A that lies in the interior of K. (We don’t want to count points on the surface of K itself.) Let f_K(t) be the minimum value of S(A) over subsets of volume t. The Cheeger constant, or expansion constant, is the minimum value of f_K(t)/t. The KLS conjecture asserts that, up to a universal constant independent of the dimension, the Cheeger constant is realized by separating K with a hyperplane! (The conjecture is also called the Kannan, Lovász and Simonovits hyperplane conjecture.)
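In formulas, with the notation above, the conjecture reads

\displaystyle \min_{0<t\le 1/2}\frac{f_K(t)}{t}\ \ge\ c\cdot \min_{H\ \text{half-space},\ \mathrm{vol}(H\cap K)\le 1/2}\frac{\mathrm{vol}_{n-1}(\partial H\cap \mathrm{int}\,K)}{\mathrm{vol}(H\cap K)},

for a universal constant c>0 that does not depend on the dimension n.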

The motivation for the KLS conjecture came from a major 1990 breakthrough in the theory of algorithms: Dyer, Frieze, and Kannan found a polynomial-time algorithm to compute the volume of a convex set K in \mathbb R^d. Kannan, Lovász and Simonovits (later joined by Santosh Vempala) wrote a series of ingenious papers where they improved the exponents of the polynomial and in the process introduced new methods for proving rapid mixing of Markov chains and new Euclidean isoperimetric results and conjectures.

As it turned out the KLS conjecture implies Bourgain’s hyperplane conjecture and also the “thin-shell conjecture” that  I will not describe here.

Yuansi Chen proved an almost-constant lower bound d^{-o(1)} for the isoperimetric coefficient in the KLS conjecture!

This also gives faster mixing time bounds of Markov chain Monte Carlo (MCMC) sampling algorithms on log-concave measures.
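To give the flavor of the algorithms involved, here is a toy version of the classical ball walk (a textbook chain sketched with arbitrary parameters, not the algorithm of the papers discussed); the Cheeger constant of K is precisely what controls the mixing time of such chains.

```python
import math, random

def ball_walk(inside, x, steps=20_000, delta=0.2):
    """Ball walk with a membership oracle: propose a uniform point in the
    delta-ball around x; stay put if the proposal leaves the body."""
    n = len(x)
    for _ in range(steps):
        g = [random.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in g))
        r = delta * random.random() ** (1.0 / n)  # uniform radius within the ball
        y = [xi + r * gi / norm for xi, gi in zip(x, g)]
        if inside(y):
            x = y
    return x

inside_cube = lambda p: all(abs(v) <= 1.0 for v in p)  # K = [-1, 1]^5
print(ball_walk(inside_cube, [0.0] * 5))  # an approximately uniform point of K
```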

Ronen Eldan’s stochastic localization scheme and the strategy proposed by Yin Tat Lee and Santosh Vempala

Yuansi Chen’s proof relies on Ronen Eldan’s stochastic localization scheme, which was introduced in this 2013 paper:

Ronen Eldan, Thin shell implies spectral gap up to polylog via a stochastic localization scheme

It follows a strategy laid out in Yin Tat Lee and Santosh Vempala’s 2017 FOCS paper:

Y. T. Lee and S. S. Vempala,  Eldan’s stochastic localization and the KLS hyperplane conjecture: an improved lower bound for expansion.

There is not much more I can tell you now about Ronen’s theory, about Yin Tat and Santosh’s strategy, about Yuansi’s proof, and about further intermediate results. (This post here might be somewhat related to Ronen’s method.) But let me very briefly mention a few related matters.

  1. There are interesting negative results regarding volume computations; see the 1987 Bárány-Füredi paper and Ronen Eldan’s 2009 paper.
  2. It is a very interesting question (that I heard from Moshe Vardi) to understand the connection between practice and theory regarding Markov chain Monte Carlo (MCMC) sampling algorithms, e.g., for approximating permanents and volumes.
  3. Related topics are: Dvoretzky’s theorem, Milman’s theorem, isotropic position, the concentration of measure phenomenon, asymptotic convex geometry.

The Busemann-Petty problem

The Busemann-Petty problem from 1956 asks whether a symmetric convex body with larger central hyperplane sections necessarily has larger volume: if K and L are centrally symmetric bodies and all hyperplane sections of K (through the origin) have larger volume than those of L, is it true that the volume of K is larger than that of L? Busemann and Petty showed that the answer is positive if K is a ball. (A positive answer when L is a ball would have given a very strong version of the slicing conjecture.) But the answer is negative in dimensions five and higher.

Here is a brief history:

Larman and Rogers constructed a counterexample in dimension 12; in their example L was a ball and K was a perturbation of a ball. Ball showed that taking L to be a ball and K the standard cube gives a counterexample in dimension 10; Giannopoulos and Bourgain gave counterexamples in dimension 7, and Papadimitrakis finally found a counterexample in dimension 5. Gardner gave a positive answer for dimension 3, and Zhang gave a positive solution for dimension 4. Erwin Lutwak’s theory of intersection bodies plays an important role in the solution. Gardner, Koldobsky, and Schlumprecht presented a unified solution in all dimensions.

This is a really beautiful story, and we saw a slice of it in our second “test your intuition”. In the late 90s I had a “what is new” corner on my homepage and I remember devoting it to some of these developments.

Sources: Gardner’s book “Geometric Tomography” and Koldobsky’s book “Fourier Analysis in Convex Geometry”.

Here are very nice slides from a lecture of Alexander Koldobsky on the problem, which also contain some information on the slicing problem.

2016 meeting: Approximate Counting, Markov Chains and Phase Transitions. A series of 2019 lectures in Atlanta on related matters (Koldobsky I, II, III; Werner; Paouris).
