Yair Shenfeld and Ramon van Handel Settled (for Polytopes) the Equality Cases of the Alexandrov-Fenchel Inequality

Two weeks ago, I participated (remotely) in the discrete geometry Oberwolfach meeting, where Ramon van Handel gave a beautiful lecture about the equality cases of the Alexandrov-Fenchel inequality, which is among the most famous problems in convex geometry.

afi9

In the top left, Ramon explains the AF-inequality and its equality cases for three dimensions. The extremal bodies, also depicted in the top right, remind me of “stacked polytopes.” On the bottom right, you can see the AF-inequality, with the definition of mixed volumes above it. In the bottom left, there’s a discussion on equality cases and why they are challenging.

Following Ramon’s lecture, Igor Pak gave a presentation demonstrating that detecting cases of equality for the Alexandrov-Fenchel inequality is computationally hard. (If it is in the polynomial hierarchy then the hierarchy collapses.) Igor’s blog post (reblogged here) eloquently delves into this issue.

Here is some background. Given convex bodies K_1,K_2,\dots,K_n in \mathbb R^n, the volume of the Minkowski sum t_1K_1+t_2K_2+\cdots +t_nK_n, where t_1,t_2,\dots,t_n\ge 0, is a polynomial in t_1,t_2,\dots, t_n. The mixed volume V(K_1,K_2,\dots,K_n) is defined as the coefficient of the mixed monomial t_1t_2\cdots t_n of this polynomial (in one common convention, normalized by a factor of 1/n!).

The Alexandrov-Fenchel inequality asserts that

\displaystyle V(K_1,K_2,\dots,K_n)^2 \ge V(K_1,K_1,K_3,\dots,K_n)V(K_2,K_2,K_3,\dots ,K_n)
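Here is a small numerical sanity check (my own illustration in Python, not taken from the papers discussed below). For axis-parallel boxes the volume polynomial factors completely: if box K_i has side lengths given by row i of a matrix, then the volume of t_1K_1+\cdots+t_nK_n is \prod_j (\sum_i t_i a_{ij}), so the coefficient of t_1t_2\cdots t_n is the permanent of the matrix, and the Alexandrov-Fenchel inequality can be checked directly.

import itertools
import random
from math import factorial, prod

def mixed_volume_boxes(boxes):
    # Mixed volume of axis-parallel boxes; row i lists the side lengths of K_i.
    # The coefficient of t_1...t_n in prod_j (sum_i t_i * boxes[i][j]) is the
    # permanent of the matrix, and V(K_1,...,K_n) = permanent / n!.
    n = len(boxes)
    perm = sum(prod(boxes[sigma[j]][j] for j in range(n))
               for sigma in itertools.permutations(range(n)))
    return perm / factorial(n)

# Alexandrov-Fenchel for random boxes in R^4:
# V(K1,K2,K3,K4)^2 >= V(K1,K1,K3,K4) * V(K2,K2,K3,K4).
n = 4
K = [[random.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
lhs = mixed_volume_boxes(K) ** 2
rhs = (mixed_volume_boxes([K[0], K[0]] + K[2:])
       * mixed_volume_boxes([K[1], K[1]] + K[2:]))
assert lhs >= rhs * (1 - 1e-9)
print(lhs, ">=", rhs)

(For boxes, the Alexandrov-Fenchel inequality is exactly Alexandrov's inequality for permanents, which already hints at why the equality cases are delicate.)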

And some overview: There exist intricate connections between three areas of convexity. The first concerns metric theory, particularly the study of valuations and the “Brunn-Minkowski” theory. The second involves arithmetic theory, specifically the “Ehrhart theory” of counting integer points in dilations of integral polytopes. The third revolves around the combinatorial theory of faces, particularly the study of f-vectors. All these areas are related to algebraic geometry and commutative algebra, particularly to “Hodge theory.”

Here are links to the relevant papers:

Yair Shenfeld and Ramon van Handel, The Extremals of the Alexandrov-Fenchel Inequality for Convex Polytopes.

Abstract: The Alexandrov-Fenchel inequality, a far-reaching generalization of the classical isoperimetric inequality to arbitrary mixed volumes, lies at the heart of convex geometry. The characterization of its extremal bodies is a long-standing open problem that dates back to Alexandrov’s original 1937 paper. The known extremals already form a very rich family, and even the fundamental conjectures on their general structure, due to Schneider, are incomplete. In this paper, we completely settle the extremals of the Alexandrov-Fenchel inequality for convex polytopes. In particular, we show that the extremals arise from the combination of three distinct mechanisms: translation, support, and dimensionality. The characterization of these mechanisms requires the development of a diverse range of techniques that shed new light on the geometry of mixed volumes of nonsmooth convex bodies. Our main result extends further beyond polytopes in a number of ways, including to the setting of quermassintegrals of arbitrary convex bodies. As an application of our main result, we settle a question of Stanley on the extremal behavior of certain log-concave sequences that arise in the combinatorics of partially ordered sets.

Comments: I recall Rolf Schneider presenting (at Oberwolfach in the 1980s) his extensive class of examples for equality cases in A-F inequalities and formulating various conjectures about them. I also remember the relations discovered by Bernard Teissier and Askold Khovanskii between A-F inequalities and algebraic geometry. This connection is discussed in the last section of the paper.

Yair Shenfeld and Ramon van Handel, The Extremals of Minkowski’s Quadratic Inequality

Abstract: In a seminal paper “Volumen und Oberfläche” (1903), Minkowski introduced the basic notion of mixed volumes and the corresponding inequalities that lie at the heart of convex geometry. The fundamental importance of characterizing the extremals of these inequalities was already emphasized by Minkowski himself, but has to date only been resolved in special cases. In this paper, we completely settle the extremals of Minkowski’s quadratic inequality, confirming a conjecture of R. Schneider. Our proof is based on the representation of mixed volumes of arbitrary convex bodies as Dirichlet forms associated to certain highly degenerate elliptic operators. A key ingredient of the proof is a quantitative rigidity property associated to these operators.

Comment: For this special case of the A-F inequality, Shenfeld and van Handel gave a complete characterization of the extremal cases. A special case of this special case asserts that

S^2(K) \ge W(K)V(K) 

where V(K), S(K), W(K) are the volume, surface area, and mean width of K, respectively (with suitable normalizations of these quantities).

Swee Hong Chan and Igor Pak, Equality cases of the Alexandrov–Fenchel inequality are not in the polynomial hierarchy

Abstract: Describing the equality conditions of the Alexandrov–Fenchel inequality has been a major open problem for decades. We prove that in the case of convex polytopes, this description is not in the polynomial hierarchy unless the polynomial hierarchy collapses to a finite level. This is the first hardness result for the problem, and is a complexity counterpart of the recent result by Shenfeld and van Handel (arXiv:2011.04059), which gave a geometric characterization of the equality conditions. The proof involves Stanley’s order polytopes and employs poset theoretic technology.

Igor Pak’s blog post elaborates on this paper.

Three more (among quite a few) relevant papers:

Yair Shenfeld and Ramon van Handel, Mixed volumes and the Bochner method; D. Cordero-Erausquin, B. Klartag, Q. Merigot, and F. Santambrogio, One more proof of the Alexandrov-Fenchel inequality; Ramon van Handel, Shephard’s inequalities, Hodge-Riemann relations, and a conjecture of Fedotov.


On the Limit of the Linear Programming Bound for Codes and Packing

Samorodnitsky

Alex Samorodnitsky

The most powerful general method for proving upper bounds for the size of error correcting codes and of spherical codes (and sphere packing) is the linear programming method that goes back to Philippe Delsarte. There are very interesting results and questions about the power of this method, and in this post I would like to present a few of them, especially results from 2001 and 2004 papers by Alex Samorodnitsky, a 2001 paper by Alexander Barg and David Jaffe, and a 2009 paper by Alex and Michael Navon. Finally, I will very briefly mention three recent papers on the limitations of the linear programming method for sphere packings (for dimension \le 32) by Cohn and Triantafillou, by Li, and by de Courcy-Ireland, Dostert, and Viazovska. I will also mention some other related papers. Two recent posts over here on sphere packing are this post and this post.

Recall that the rate R(C) of a binary code C of length n is defined as \displaystyle \frac {\log |C|}{n}. Let us start with the following central questions:

Question 1: For a real number \alpha, 0 < \alpha < 1/2, what is (asymptotically) the maximum rate f(\alpha) of a binary error-correcting code of length n and minimum distance d=\alpha n?

Question 2: For a real number \theta < \pi/2, consider spherical d-dimensional codes C with minimal angle \theta. What is the maximum value of \displaystyle \frac {\log |C|}{d} as d goes to infinity?

The best known upper bound for Question 1 is the McEliece, Rodemich, Rumsey
and Welch (MRRW) bound; the best known upper bound for Question 2 is the Kabatiansky-Levenshtein (KL) bound.

Of course, questions 1 and 2 are important asymptotic special cases of even more general questions regarding upper bounds for the sizes of binary and spherical codes.

We will mainly discuss in this post the following two problems.

Question 3: For a real number \alpha, 0 < \alpha < 1/2, what is (asymptotically) the best upper bound B(\alpha) for f(\alpha) that follows from Delsarte’s linear program?

Question 4: For a real number \theta <\pi/2, what is (asymptotically) the best upper bound B(\theta) for the rate of spherical codes with minimal angle \theta (as in Question 2) that follows from the linear programming method?
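Before turning to the papers, here is what Delsarte’s linear program looks like in practice for concrete n and d (a minimal sketch of the standard distance-distribution formulation, my own illustration; it assumes scipy is available). Its optimum is an upper bound on the maximal size A(n,d) of a binary code of length n and minimum distance d.

from math import comb
from scipy.optimize import linprog

def krawtchouk(n, k, x):
    # Krawtchouk polynomial K_k(x) for the Hamming cube {0,1}^n.
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def delsarte_lp_bound(n, d):
    # Variables: the distance distribution a_1, ..., a_n of a putative code.
    # Maximize 1 + sum_i a_i subject to a_i >= 0, a_i = 0 for 0 < i < d, and
    # sum_i a_i K_k(i) >= -K_k(0) for k = 1, ..., n (Krawtchouk positivity).
    c = [-1.0] * n  # linprog minimizes, so we negate and add 1 at the end
    A_ub = [[-krawtchouk(n, k, i) for i in range(1, n + 1)] for k in range(1, n + 1)]
    b_ub = [krawtchouk(n, k, 0) for k in range(1, n + 1)]
    bounds = [(0.0, 0.0) if i < d else (0.0, None) for i in range(1, n + 1)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return 1.0 - res.fun

print(delsarte_lp_bound(10, 4))  # an upper bound on A(10, 4)

Questions 3 and 4 ask how the optimum of this program behaves asymptotically, when n goes to infinity with d = \alpha n.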

We start with the following paper:

On the optimum of Delsarte’s linear program, A. Samorodnitsky, Journal of Combinatorial Theory, Ser. A, 96, 2, 2001

Abstract: We are interested in the maximal size A(n, d) of a binary error-correcting code of length n and distance d, or, alternatively, in the best packing of balls of radius (d-1)/2 in the n-dimensional Hamming space. The best known lower bound on A(n, d) is due to Gilbert and Varshamov, and is obtained by a covering argument. The best known upper bound is due to McEliece, Rodemich, Rumsey and Welch, and is obtained using Delsarte’s linear programming approach. It is not known whether this is the best possible bound one can obtain from Delsarte’s linear program. We show that the optimal upper bound obtainable from Delsarte’s linear program will strictly exceed the Gilbert-Varshamov lower bound. In fact, it will be at least as big as the average of the Gilbert-Varshamov bound and the McEliece, Rodemich, Rumsey and Welch upper bound. Similar results hold for constant weight binary codes. The average of the Gilbert-Varshamov bound and the McEliece, Rodemich, Rumsey and Welch upper bound might be the true value of Delsarte’s bound. We provide some evidence for this conjecture.

For 0 \le \alpha \le 1 let H(\alpha) = -\alpha \log_2 \alpha - (1-\alpha)\log_2 (1-\alpha) denote the binary entropy function.

Samorodnitsky Theorem (2001): \displaystyle B(\delta) \ge \frac{1}{2}\left(H(1/2 - \sqrt{\delta(1-\delta)}) + 1 - H(\delta)\right).

I remember that I was (around 1998) completely fascinated by Alex’s result. This was still in the BB era (before blogs), so I did not write about it. As we will see, Alex himself improved the result and thus refuted the conjecture at the end of the abstract. Motivated by Alex’s paper, Alexander Barg and David Jaffe conducted numerical experiments for n=1000. In their paper Numerical results on the asymptotic rate of binary codes, they wrote: “the most important conclusion is that Delsarte’s linear programming method is unlikely to yield major improvements of the known general upper bounds on the size of codes,” leading to the following conjectures:

Conjecture 6 (supported by numerical computation by Barg and Jaffe): Asymptotically, the MRRW bounds  are the best that can be derived from Delsarte’s linear programming.

Conjecture 7 (by analogy): For spherical codes, asymptotically, the KL bounds are the best that can be derived from the linear programming method.

I find Conjectures 6 and 7 very fascinating and yet perhaps not hopeless.

We note that MRRW themselves (naturally) conjectured that Delsarte’s LP can lead to better bounds than their bounds.

In 2004 Alex proved an analogue, for spherical codes, of his result about binary codes. (The paper also studies spherical designs.) The best known upper bounds, by Grigory Kabatiansky and Vladimir Levenshtein, are based on Delsarte’s LP. The best known lower bounds are achieved by greedy spherical codes (or random spherical codes), and only slight improvements to them are known. (See this post; these small improvements have no bearing on the rate.)

On linear programming bounds for spherical codes and designs, A. Samorodnitsky, Discrete and Computational Geometry, 31, 3, 2004.

Abstract: We investigate universal bounds on spherical codes and spherical designs that could be obtained using Delsarte’s linear programming methods. We give a lower estimate for the LP upper bound on codes, and an upper estimate for the LP lower bound on designs. Specifically, when the distance of the code is fixed and the dimension goes to infinity, the LP upper bound on codes is at least as large as the average of the best known upper and lower bounds. When the dimension n of the design is fixed, and the strength k goes to infinity, the LP bound on designs turns out, in conjunction with known lower bounds, to be proportional to k^{n-1}.

The next paper I want to discuss is Linear programming bounds for codes via a covering argument, M. Navon, A. Samorodnitsky, Discrete and Computational Geometry, 41, 2, 2009.

This paper contains results in various directions, but related to the topic of our post, it improves Alex’s 2001 bound on the limitations of Delsarte’s LP:

Navon and Samorodnitsky Theorem (2009):

B(\delta) \ge 1/2 H(1 - 2\sqrt{\delta(1-\delta)}).

This gives better bounds for large values of \delta and is the current world record for B(\delta).
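As a quick numerical illustration (my own back-of-the-envelope check, not from the papers), here is how the two lower bounds on B(\delta), as stated above, compare; the 2009 bound takes over for larger \delta:

from math import log2, sqrt

def H(x):
    # binary entropy function (guarding the endpoints x = 0 and x = 1)
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def bound_2001(delta):
    # average of the MRRW1 bound and the Gilbert-Varshamov bound
    return (H(0.5 - sqrt(delta * (1 - delta))) + 1 - H(delta)) / 2

def bound_2009(delta):
    # the Navon-Samorodnitsky bound
    return 0.5 * H(1 - 2 * sqrt(delta * (1 - delta)))

for delta in (0.1, 0.25, 0.4):
    print(delta, bound_2001(delta), bound_2009(delta))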

Let me mention three recent papers about the limitations of the Cohn-Elkies linear program for sphere packing. The first is Henry Cohn and Nicholas Triantafillou, Dual linear programming bounds for sphere packing via modular forms; the second is by Rupert Li, Dual Linear Programming Bounds for Sphere Packing via Discrete Reductions; and the third is by Matthew de Courcy-Ireland, Maria Dostert, and Maryna Viazovska, Six-dimensional sphere packing and linear programming.

Some comments:

1) MRRW1 and MRRW2. MRRW beautifully derived their bound H(1/2 - \sqrt{\delta(1-\delta)}) (referred to as MRRW1) from Delsarte’s LP, went on to apply Delsarte’s linear program for constant weight codes, and obtained similar upper bounds for them. Then they used these bounds to improve MRRW1 to stronger bounds (referred to as MRRW2) for some values of \delta. As it recently turned out, MRRW2 can also be derived directly from Delsarte’s LP for the Hamming cube.

2) The Fourier connections. There is a nice way to derive Delsarte’s linear program via Fourier analysis of Boolean functions. This is explained in a 1995 paper by Nati Linial and me, On the distance distribution of binary codes.

3) Various proofs of MRRW1. Although the original proofs of MRRW1 (and MRRW2) are not very complicated, new proofs shed light on the theorem. Let me mention a few papers: Friedman and Tillich’s ingenious paper, Generalized Alon–Boppana theorems and error-correcting codes; the paper by Navon and Samorodnitsky mentioned above; a recent paper by Samorodnitsky, One more proof of the first linear programming bound for binary codes and two conjectures; and a recent paper by Linial and Elyassaf Loyfer, An Elementary Proof of the First LP Bound on the Rate of Binary Codes.

4) What will it take to improve MRRW or KL? This is, of course, a holy grail in the area, and a central problem in mathematics. Over the years there were several interesting proposals on how to improve Delsarte’s LP and improve the MRRW and KL bounds. So far, there have been no improvements of the bounds themselves. This is a topic for a whole new post.

5) Can hypercontractivity help? A suggestion from my old paper with Nati was to add hypercontractive inequalities to Delsarte’s LP. It would be nice to pursue this direction. Alex’s conjectures in the paper mentioned in comment 3 are related. Alex told me that hypercontractive inequalities (and discrete isoperimetric inequalities) are also useful for understanding the limitations of Delsarte’s LP.

Two additional pictures

fbds

From a Facebook discussion about the breakthrough of Marcelo Campos, Matthew Jenssen, Marcus Michelen, and Julian Sahasrabudhe. 

hcnt

Links to various relevant papers. (Please comment on additional relevant papers especially on the central question of the limitations of Delsarte’s LP.)

  1. R. McEliece, E. Rodemich, H. Rumsey, and L. Welch, New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities, IEEE Transactions on Information Theory 23 (1977).
  2. G. A. Kabatyanskii and V. I. Levenshtein, Bounds for packings on a sphere and in space, Probl. Inform. Transm. 14 (1978).
  3. Philippe Delsarte and Vladimir I. Levenshtein, Association schemes and coding theory, IEEE Transactions on Information Theory 44 (1998).
  4. Gil Kalai and Nati Linial, On the distance distribution of binary codes, IEEE Trans. Inf. Th. 41 (1995).
  5. A. Samorodnitsky, On the optimum of Delsarte’s linear program, Journal of Combinatorial Theory, Ser. A 96 (2001).
  6. N. Linial and A. Samorodnitsky, Linear codes and character sums, Combinatorica 22 (2002), no. 4, 497–522.
  7. Alexander Barg and David Jaffe, Numerical results on the asymptotic rate of binary codes.
  8. Alexei Ashikhmin, Alexander Barg, and Simon Litsyn, Estimates of the distance distribution of codes and designs, IEEE Transactions on Information Theory 47 (2001).
  9. Joel Friedman and Jean-Pierre Tillich, Generalized Alon–Boppana theorems and error-correcting codes, SIAM J. Discret. Math. 19 (2005), 700-718.
  10. A. Schrijver, New code upper bounds from the Terwilliger algebra and semidefinite programming, IEEE Transactions on Information Theory 51 (2005), 2859–2866.
  11. M. Navon and A. Samorodnitsky, Linear programming bounds for codes via a covering argument, Discrete and Computational Geometry 41 (2009).
  12. Alex Samorodnitsky, A modified logarithmic Sobolev inequality for the Hamming cube and some applications, arXiv:0807.1679.
  13. Henry Cohn and Nicholas Triantafillou, Dual linear programming bounds for sphere packing via modular forms.
  14. Nati Linial and Jonathan Mosheiff, On the weight distribution of random binary linear codes, Random Structures and Algorithms 56 (2020), 5-36.
  15. Leonardo N. Coregliano, Fernando G. Jeronimo, and Chris Jones, A Complete Linear Programming Hierarchy for Linear Codes, arXiv:2112.09221.
  16. Rupert Li, Dual Linear Programming Bounds for Sphere Packing via Discrete Reductions, arXiv:2206.09876.
  17. Alex Samorodnitsky, Hypercontractive Inequalities for the Second Norm of Highly Concentrated Functions, and Mrs. Gerber’s-Type Inequalities for the Second Rényi Entropy.
  18. Matthew de Courcy-Ireland, Maria Dostert, and Maryna Viazovska, Six-dimensional sphere packing and linear programming.
  19. Alex Samorodnitsky, One more proof of the first linear programming bound for binary codes and two conjectures, Israel Journal of Mathematics.
  20. Nati Linial and Elyassaf Loyfer, New LP-based upper bounds in the rate-vs.-distance problem for linear codes, arXiv:2206.09211.
  21. Nati Linial and Elyassaf Loyfer, An elementary proof of the first LP bound on the rate of binary codes, arXiv:2303.16619.
  22. Steven Dougherty, Jon-Lark Kim, and Patrick Sole, Open problems in coding theory, 2015.

TYI 54: A Variant of Elchanan Mossel’s Amazing Dice Paradox

The following question was inspired by recent comments to the post on Elchanan Mossel’s amazing Dice Paradox.

A fair dice is a dice such that when it is thrown, each of the six outcomes occurs with probability 1/6. A random dice is one where the outcome ‘k‘ occurs with probability p_k, and (p_1,p_2, \dots ,p_6) is a vector chosen uniformly at random from the 5-dimensional simplex.

Test your intuition:

You throw a random dice until you get 6. What is the expected number of throws (including the throw giving 6) conditioned on the event that all throws gave even numbers?

rd
Addition: Test your intuition also on the following simpler question. We have a coin that gives heads with probability p and tails with probability 1-p, where p is itself chosen (once and for all) uniformly at random in [0,1]. What is the expected number of coin tosses until we get heads?

Addition: Note that there are two different ways to understand the question itself. One way is to regard the conditioning as applying both to the dice throws and to the random choice of the probabilities (p_1,p_2,\dots ,p_6). A second way is to first fix the parameters (p_1,p_2,\dots,p_6), chosen uniformly at random; then compute, for that fixed dice, the expected number of throws conditioned on the event that all throws gave even numbers; and finally average over (p_1,p_2,\dots,p_6).
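Here is a quick Monte Carlo sketch (my own, in Python) of the first reading, where the conditioning applies jointly to the probability vector and to the throws; the second reading would instead require estimating the conditional expectation separately for each fixed (p_1,\dots,p_6) and then averaging.

import random

def trial():
    # (p_1,...,p_6) uniform on the 5-dimensional simplex = normalized Exp(1)'s
    w = [random.expovariate(1.0) for _ in range(6)]
    p = [x / sum(w) for x in w]
    throws = 0
    while True:
        face = random.choices(range(1, 7), weights=p)[0]
        throws += 1
        if face % 2 == 1:
            return None       # an odd number appeared: the conditioning event fails
        if face == 6:
            return throws     # all throws were even and we reached a 6

results = [r for r in (trial() for _ in range(200_000)) if r is not None]
print(sum(results) / len(results))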


Riddles (Stumpers), Psychology, and AI.

Two months ago I presented five riddles and here, in this post, you will find a few more. (These types of riddles are called by Maya Bar-Hillel “stumpers”.) This time, we will have a discussion about how riddles are related to psychology and how this relation might be connected to (or even explored by) AI. Let’s start with three new riddles.

Three new riddles

Cecil is a criminal lawyer. Many years ago, he spent a couple of days in hospital. When he was discharged, he was in perfect health. Indeed, there was nothing at all wrong with him. Nonetheless, he was incapable of leaving his hospital bed on his own and had to be carried off of it. Please explain.

Explain how it is possible for Christmas and New Year’s to fall in the same year.

A ping pong ball was hit. It flew in the air, stopped, reversed direction, and returned to where it originated. The ball was not attached to anything, nor bounced off anything. Explain in a few words how that is possible.

GPT3 and ChatGPT are facing the ping-pong riddle

pingpong2

Note that the first two sentences by GPT3 are a fine solution to the riddle. But the third sentence spoils it: a) the reversal had already occurred at the apex; b) hitting the ground violates the stipulation that the ball did not bounce off anything.

pingpong4

Here, ChatGPT (Dec. 2023) clearly failed. This specific answer is not similar to any human response we have collected.

Riddles and Psychology

There is a long tradition of using riddles to study human thinking and problem solving. The riddles Maya calls “stumpers” are closely related to riddles called “one-step insight problems” by Aisling Murray and Ruth Byrne.

When a listener hears a stumper she forms a (typically visual) mental representation of the narrative, from which it is not possible to see the solution. However, there exist other representations, consistent with the narrative, in which the solution is immediate. 

This post is based on conversations with Maya as well as on three papers:

Maya Bar-Hillel, Tom Noah and Shane Frederick, Learning psychology from riddles: The case of stumpers, Judgment & Decision Making, 13 (2018), 112-122. (We may also refer to the authors by the acronym BNF or by their first names Maya, Tom, and Shane.)

Maya Bar-Hillel, Tom Noah and Shane Frederick, Solving stumpers, CRT and CRAT: Are the abilities related? Judgment & Decision Making , 14 (2019) , 620 – 623.

Maya Bar Hillel,  An annotated compendium of stumpers, Thinking and Reasoning, 27 (2021),  536–566.

Outline 

Following Bar-Hillel (2021), I will start with the work of Eleanor Rosch to explain people’s reliance on default mental representations, which hinders them in solving certain stumpers; I will then mention the work of Paul Grice to explain why people fail to solve other stumpers whose solutions actually lie in the default mental representations. Maya calls these two categories “Roschian stumpers” and “Gricean stumpers,” respectively. Next, I will tell you about my attempt with Maya to solve stumpers using AI generative programs, and we will see a lot of entertaining examples along the way. Specifically, we gathered some data on how AI tries to solve these riddles or to represent relevant fragments thereof. In this context, we pondered questions such as: Can an understanding of human psychology be gleaned from experimenting with AI? Alternatively, do AI systems exhibit distinctive “psychological” characteristics of their own?

Maya and I had several fruitful discussions with Daphna Shahaf from The Hebrew University and I had interesting conversations with Ohad Fried and Arik Shamir from Reichman University. By the way, Maya’s father, Yehoshua Bar-Hillel, Arik’s father (and my dear friend) Eli Shamir, and my “academic father” (and also dear friend) Micha A. Perles, wrote together a famous paper about automata On formal properties of simple phrase structure grammars. Z. Phonetik, Sprachwissen. Kommun., 14(1961), 143–172.

GPT3 faces the Christmas-New Year’s riddle

Christmas1

Here GPT3 had difficulties

Eleanor Rosch and Roschian stumpers

Let’s go over some of the riddles from the previous post. Identifying the default mental representation evoked by a riddle is related to Rosch’s work. To put it briefly, Rosch argues that people’s representations of natural language categories are based on the category prototypes rather than on formal definitions. At the end of the post, you will find some more details on Rosch’s theory.

Successful Roschian stumpers are those whose default representation is so prevalent and durable that most respondents remain locked into it and fail to solve the stumper. Here are three examples.

Two men play five checkers games. Each wins an even number of games, with no ties. How is this possible?

The overwhelming visual representation is that of two players playing with each other.  The visual image of two players playing checkers against each other is more economical than the image of two players playing checkers against two other players. This is an example of a general principle of “cognitive economy, ” namely preferring the representation requiring  least cognitive effort.

Individual bus rides cost one dollar each. A card good for five rides costs five dollars. A first-time passenger boards the bus alone and hands the driver five dollars, without saying a word. Yet the driver immediately realizes, for sure, that the passenger wants the card, rather than a single ride and change. How is that possible?

BNF hypothesized that “it is mentally easier to conjure a representation of a single item than of multiple items. Thus, a five dollar bill is easier to ‘see’ than five singles”.

In our market, a small potato bag costs 5 taka, a medium potato bag costs 7 taka, and a large potato bag costs 9 taka. However, a single potato costs 15 taka. Please explain.

The default representation of a potato bag is that of a bag filled with potatoes rather than an empty bag. The explanation is that although the purpose of containers (bottles, boxes, jugs, bags) is to contain, BNF hypothesized that a container (bottle, box, jug) is represented as empty unless specific contents are referenced (e.g., wine bottle, cereal box, milk jug).

Paul Grice and Gricean Stumpers

Gricean stumpers rely on destroying the default representation. Respondents are steered away to an alternative representation by certain words in the narrative, whereas the solution actually lies in the default one.

For example, consider the following riddle:

A big brown cow is lying down in the middle of a country road. The streetlights are not on, the moon and stars are not out, and the skies are heavily clouded. A truck is driving towards the cow at full speed, its headlights off. Yet the driver sees the cow from afar easily and stops in time. How is that possible?

The default visual representation of any generic scene, and in particular of “a big brown cow is lying down in the middle of a country road,” is that it happens in daylight. However, adding that the streetlights are not on, etc., modifies the default representation. The reason yields another important insight into psychology that was studied by Paul Grice. As BNF explain: “One of Grice’s (1975) conversational maxims is ‘Do not make your contribution more informative than is required’ (p. 45). Hence, when the narrative bothers to mention the absence of light sources, darkness is evoked (since those light sources are otherwise irrelevant).”

At the end of the post, you will find some more details on  Grice’s paper “Logic and Conversation.”

Enter AI generative programs

Together with Maya, we asked ourselves some questions about AI generative programs and gathered some data on the results of letting the AI programs current at the time try to solve these riddles or to represent relevant fragments of them. Since we did not know much about AI, it was a good opportunity to learn about them. Maya was especially interested in visual representations and was therefore especially interested in the DALL-E 2 experiments. Most of our experiments with AI took place in the summer and fall of 2022. We used DALL-E 2, GPT3, and chatGPT. (These AI programs were rapidly advancing, and GPT4 had not yet been released.)

My main conclusion from many experiments was that, overall, the AI responses showed that computers conjure the same default representations that cause humans to be stumped by riddles. That is, the AI followed the default mental representations that BNF identified, and the AI responses also (sometimes explicitly) followed Grice’s rules. Maya’s interpretation was different. She thought that the data was a big mess: sometimes the AI was stumped in the same way as humans, and sometimes not. Moreover, Maya’s impression was that often the AI was not advanced enough even to tell.

I sort of like Maya’s interpretation because I find self-critical, cautious, and non-self-promoting interpretations endearing; I even tend to agree with her that some of the data is a big mess. However, I also think that, nonetheless, the data clearly showed that AI manifests similar “behavior” to humans.

There were several points Maya and I agreed upon. We both agree that some inherent weaknesses of (otherwise highly impressive) AI affect the ability to draw conclusions in our case. (I even thought that the AI responses could sometimes be seen as an exaggerated form or even a caricature of human responses.) We also agree that it is difficult to conduct a quantitative analysis similar to the analysis of experiments with many individuals. Maya also mentioned that her impression from a recent conference was that many researchers are trying to conduct research with AI using their favorite tasks.

Selected AI Experiments

1. Falling from a 60-foot tall ladder

Pedro was washing windows outside a tall office building when he fell off his 60-foot ladder onto the ground below. Incredibly, he did not injure himself in any way. He was not wearing any safety equipment, and nothing broke his fall. Explain in a few words how he was not injured.

(Test your intuition: is this a Roschian stumper or a Gricean stumper?)

GPT-3’s response (Sept. 2022) to the full riddle

Pedro was not injured because he landed in a pile of feathers. The feathers cushioned his fall and prevented him from being injured.

We let Dalle-E2 make a drawing prompted by “Nancy fell from a 60-feet ladder by Goya”

60feet

All AI programs that we tried (just as humans) located Pedro high up on his ladder.

And here is ChatGPT (Jan. 2023)

60feetchatgpt

Although chatGPT’s response looks a little funny, in a sense it represents quite a good reaction to the prompt. The prompt (as commonly understood by humans, and apparently also as “understood” by chatGPT) describes an unlikely scenario that chatGPT politely disbelieved. (The use of the word “incredibly” in the riddle supports this interpretation.)

2. Sharing a pizza

(GPT) Six Italians are sharing a Pizza. The older Italian is the father of the younger Italians. Try to Guess the names of the Italians.

  1. Paolo
  2. Gino
  3. Franco
  4. Angelo
  5. Marco

For each one, say if it is a man or a woman.

  1. Men [sic.]
  2. Men
  3. Men
  4. Men
  5. Men

Why did you choose them all to be men?

There is no specific reason, it is just a guess.

Remark: We noticed that AI programs tend to regard all members of a group of people as being of the same gender. Therefore, if one member is marked as a man or as a woman, the AI will tend to mark all of them as such. This seems to express some form of “cognitive economy”; we do not know if this is similar to human behavior.

Prompt: Three Italians are sharing a pizza. The older Italian is the mother of the younger Italians

Here is what Dalle-E2 produced to this prompt

pizza3W

3.  Facing opposite directions 

Two old friends are standing together on a hilltop, in complete silence. Nora is facing north, and Sara is facing south. Suddenly Nora asks: “Why are you smiling, Sara?” Explain briefly how Nora saw Sara’s smile.

Here is a screenshot of the first hits in Google Images for “two people face opposite directions.”

opposite

In most images the people do not see each other. 

norasara

4. Playing Checkers

Two men play five checker games. Each wins an even number of games, with no ties. How is this possible?

checkers

5. Eggs for breakfast

Farmer Joe eats two fresh eggs from his own farm for breakfast every day. Yet there are no chickens on his farm. 

We told chatGPT the story about farmer Joe and asked it to elaborate. This was the response:

Farmer Joe was a hardworking man who lived on a sprawling farm in the countryside. Every day, he would wake up early and work tirelessly to tend to his crops and animals. He was always up before the sun, and one of the first things he did each morning was to prepare himself a delicious breakfast.

For breakfast, Farmer Joe would always eat two fresh eggs from his own farm. They were his favorite food, and he looked forward to them every day. Despite the fact that there were no chickens on his farm, the eggs were always there for him in the morning, as if by magic.

At first, Farmer Joe thought nothing of this strange occurrence. He simply assumed that one of his neighbors had given him the eggs as a gift. However, as the weeks went by and the eggs continued to mysteriously appear on his kitchen counter each morning, he began to wonder how it was possible. He asked around and no one in the area had any chickens.

One day, Farmer Joe decided to investigate. He scoured the entire farm, searching for any signs of chickens or eggs, but he found nothing. He even checked with his neighbors, but they had no explanation either.

In the end, Farmer Joe was left with a mystery that he could not solve. How could he be eating two fresh eggs every day if there were no chickens on his farm? The eggs seemed to be appearing out of thin air, and no one could explain how it was possible.

Despite the mystery, Farmer Joe continued to eat his beloved eggs every morning, grateful for their delicious taste and the memories they brought

Prompt: But now write a new version where there is no mystery left. The eggs are not from anybody else but from the farm and at the same time there are no chickens in the farm.

Farmer Joe runs a successful and innovative farm that specializes in producing high-quality eggs. Despite not having any chickens on the property, he has found a way to produce fresh eggs for his daily breakfast using advanced methods of egg cultivation.

Every morning, Farmer Joe enjoys a hearty breakfast consisting of two perfectly cooked, fresh eggs from his farm. He takes great pride in the quality of his eggs and the methods used to produce them.

At first, it may seem like a mystery as to how Farmer Joe is able to have eggs for breakfast every day without any chickens on the farm. However, with the advancements in agriculture and food production, there are now methods of cultivating eggs without the need for traditional egg-laying hens. This technology allows for the production of fresh eggs in a controlled environment, ensuring that they are of the highest quality.

So, there is no mystery left, just a dedicated farmer who is taking advantage of the latest in egg production technology to provide himself with delicious, fresh eggs every day.

Prompt: Now tell a similar story, also with no mystery being left that occurred in 1927.

In 1927, Farmer Joe lived on a large farm in the countryside. Despite not having any chickens on the property, he was still able to enjoy fresh eggs for breakfast every day. This was because he had taken a unique approach to raising his own poultry.

Instead of having chickens, Farmer Joe had decided to raise a breed of domesticated ducks that laid high-quality, nutritious eggs. These ducks

were well-cared for and lived in spacious coops that provided them with plenty of room to roam and plenty of fresh water to swim in.

Every day, Farmer Joe would collect a basket of fresh eggs from the coops and bring them into the kitchen to cook for his breakfast. He took great pride in the taste and quality of the eggs, and would often invite friends and neighbors over to sample them for themselves.

Despite the unusual approach to raising poultry, Farmer Joe’s farm was thriving and his eggs were in high demand. He became known as the “Duck Egg Farmer” and was celebrated for his innovative and sustainable farming practices.

By embracing an alternative method of raising poultry, Farmer Joe was able to provide himself and others with a delicious and healthy source of protein, all while avoiding the traditional challenges and expenses associated with raising chickens

Conclusion (for now)

Try for yourself!

More on Grice’s paper “Logic and Conversation” 

 “Logic and Conversation” is a seminal paper in the field of philosophy of language and pragmatics by H. P. Grice, originally published in 1975. The paper is part of a broader exploration of how speakers convey meaning and communicate in natural language, going beyond the literal meaning of words and sentences. Grice introduces the concept of implicature, which is a way of understanding what is communicated beyond the explicit content of an utterance and the logical inferences therefrom.
 
In the paper, Grice discusses the Cooperative Principle, which is the assumption that speakers and listeners cooperate to communicate effectively. He identifies four maxims of conversation that underlie this principle: the Maxim of Quality, Quantity, Relation, and Manner. Grice argues that these maxims guide conversation by informing speakers and listeners about how to convey and interpret meaning. These maxims afford implicatures, which are the additional implied meanings in a conversation.
 
Grice’s paper emphasizes the importance of conversational implicature in understanding how people communicate and infer meaning in everyday language use. He also distinguishes between implicatures that arise from the maxims and implicatures that are calculable and derived through reasoning. This distinction has been influential in the study of pragmatics and has had a lasting impact on linguistic and philosophical discussions of language and communication.
 
Overall, “Logic and Conversation” by H. P. Grice is a foundational work that has significantly contributed to the development of the theory of implicature and the understanding of how language is used in real-world conversations. It has had a profound influence on the study of pragmatics and continues to be a central reference in the field of linguistics and philosophy.

(Source: ChatGPT with two little corrections by Maya.) 

P. Grice,   Logic and Conversation. In Syntax and Semantics, 3(1975), 41- 58. 

More on Eleanor Rosch

Eleanor Rosch, a prominent cognitive psychologist, is best known for her groundbreaking work in the field of categorization and prototype theory. Rosch developed the prototype theory of categorization, challenging traditional views of categorization as being based on defining features. Rosch proposed that people organize objects and concepts into categories based on prototypes – typical or representative examples that embody the most salient features of a category.

Her influential research, particularly the seminal paper “Natural Categories,” published in 1973, demonstrated that people tend to categorize objects based on family resemblances rather than strict definitions. Rosch’s work has had a profound impact on cognitive science, linguistics, and philosophy, influencing how researchers and scholars understand the cognitive processes underlying human conceptualization and categorization. Additionally, her contributions have influenced fields such as artificial intelligence and have paved the way for a more nuanced understanding of how individuals structure their mental representations of the world.

And here we discuss Ludwig Wittgenstein’s influence on Rosch:

Eleanor Rosch’s work, particularly her development of prototype theory and her contributions to the study of categorization, has been influenced by the ideas of Ludwig Wittgenstein, a philosopher known for his work in the philosophy of language and the philosophy of mind. Wittgenstein’s later philosophy, especially as expressed in his posthumously published work “Philosophical Investigations,” had a significant impact on Rosch’s thinking.

Wittgenstein’s ideas challenged traditional views of language and meaning, emphasizing the importance of language use in specific contexts rather than relying solely on fixed definitions. His concept of “family resemblances,” where members of a category share overlapping features rather than possessing a set of defining characteristics, has been particularly influential. Rosch applied and extended these notions to the realm of cognitive psychology, proposing that people categorize objects based on prototypes or typical examples that capture the family resemblance of a category. This departure from a strict, rule-based definition of categories marked a shift in how researchers understand human cognition and categorization, with Wittgenstein’s ideas serving as a philosophical foundation for Rosch’s empirical work.

(Sources: ChatGPT)

E. Rosch, Natural Categories, Cognitive psychology, 4 (1973), 328-350.

E. Rosch,  Principles of categorisation. In E. Rosch & B. B. Floyd (Eds.),Cognition and categorisation, pp. 27–48. Lawrence Erlbaum Associates, 1978.

Finally, here is Cecil the criminal lawyer at the age of two (Dalle-E2)

DALL·E 2023-02-07 20.51.25 - Cecil is a criminal lawyer. Show a picture of Cecil at age of two.

And as a bonus, Dalle-E2’s view of Karl Marx at the age of two

DALL·E 2023-12-27 22.34.56 - Show a picture of Karl Marx at the age of two.


Soma Villanyi: Every d(d+1)-connected graph is globally rigid in d dimensions.

rigidity3

Today, I want to tell you a little about the following paper, which solves a conjecture of László Lovász and Yechiam Yemini from 1982 and an even stronger conjecture of Bob Connelly, Tibor Jordán, and Walter Whiteley from 2013:

Every d(d+1)-connected graph is globally rigid in \mathbb R^d, by Soma Villányi

Here is the abstract (h/t to Nati Linial for telling me about the paper):

Using a probabilistic method, we prove that d(d+1)-connected graphs are rigid in \mathbb R^d, a conjecture of Lovász and Yemini. Then, using recent results on weakly globally linked pairs, we modify our argument to prove that d(d+1)-connected graphs are globally rigid, too, a conjecture of Connelly, Jordán and Whiteley. The constant d(d+1) is best possible. 

Let G be an abstract graph with n vertices. A d-dimensional framework is an embedding of the vertices of G into \mathbb R^d. The embedding is flexible if there is a non-trivial flex, i.e., a non-trivial perturbation of the embedded vertices that keeps all the edge lengths fixed. Here, a trivial flex means a flex that comes from a rigid motion of the entire space. In other words, a trivial flex keeps the distances between every pair of embedded vertices fixed. An embedding is rigid if it is not flexible. There is a related linear notion of infinitesimal flex – an assignment of velocity vectors to the vertices so that all edge lengths are constant to first order. An embedding is globally rigid if every other embedding with the same distances between adjacent vertices is obtained by a rigid motion of \mathbb R^d.

A graph G is generically d-rigid, or simply d-rigid, if it is rigid for a generic embedding into \mathbb R^d. (This is equivalent to being generically infinitesimally rigid.) A graph is globally d-rigid if it is globally rigid for a generic embedding into \mathbb R^d.

Being d-rigid is a mysterious property of graphs, and being globally d-rigid is an even more mysterious property. The edge sets of minimal d-rigid spanning subgraphs of the complete graph on n vertices are the bases of the d-rigidity matroid. A graph is 1-rigid iff it is connected, so we can think about d-rigidity as a strong form of connectedness. Another extension of graph connectivity is the notion of a k-connected graph, namely a graph that remains connected when we delete any set of at most k-1 vertices. Lovász and Yemini proved in 1982 that every 6-connected graph is 2-rigid, and made the conjecture that Villányi has just proved. Jackson and Jordán proved in 2005 that 6-connected graphs are globally 2-rigid. (A small numerical illustration of generic rigidity follows below.)
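As an aside, generic d-rigidity is easy to experiment with numerically: a random embedding is generic with probability one, and a graph on n \ge d vertices is generically d-rigid iff the rank of its rigidity matrix equals dn - d(d+1)/2 (the space of rigid motions of \mathbb R^d has dimension d(d+1)/2). Here is a minimal sketch (my own illustration, not from the paper):

import numpy as np

def is_generically_rigid(n, edges, d, seed=0):
    # Rank test of the rigidity matrix at a random (hence, almost surely
    # generic) embedding; assumes n >= d.
    rng = np.random.default_rng(seed)
    p = rng.standard_normal((n, d))
    R = np.zeros((len(edges), n * d))   # one row per edge
    for row, (u, v) in enumerate(edges):
        R[row, d*u:d*u + d] = p[u] - p[v]
        R[row, d*v:d*v + d] = p[v] - p[u]
    return np.linalg.matrix_rank(R) == d * n - d * (d + 1) // 2

print(is_generically_rigid(4, [(0,1),(0,2),(0,3),(1,2),(1,3),(2,3)], 2))  # K_4: True
print(is_generically_rigid(4, [(0,1),(1,2),(2,3),(3,0)], 2))              # 4-cycle: False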

Congratulations, Soma Villanyi, and happy and better new year to all


Fun with Algorithms (FUN 2024)

FUN24logo

Fun with Algorithms 

FUN is a series of conferences dedicated to the use, design, and analysis of algorithms and data structures, focusing on results that are “fun” but also original and scientifically solid. “Fun” can be defined in many ways: we think of fun results as being amusing or entertaining by their display of elegance, simplicity, surprise, originality, or wit. The topics of interest include all aspects of algorithm design and analysis under any computing model. Participating in FUN is not something you are likely to regret. Tami Tamir is the PC chair of this year’s conference (FUN 2024). Here is the link for more details on the conference itself and the whole series of conferences. The conference will take place on June 4–8, 2024 on the island of La Maddalena, Sardinia, Italy. The conference is going to have, as a special track chaired by Andrei Broder, a CS Salon des Refusés, dedicated to important results in CS theory that were rejected in their early attempts to be published.

Here is the conference’s call for papers. The deadline is February 20, 2024. The very same deadline applies to submissions for the Salon des Refusés.

&&&&&&&&&&&&&&&&&&

Our algorithm corner

About a decade ago we made here the following announcement: “while chatting with Yuval Rabani and Daniel Spielman I realized that there are various exciting things happening in algorithms (and not reported so much in blogs). Much progress has been made on basic questions: TSP, Bin Packing, flows & bipartite matching, market equilibria, and k-servers, to mention a few, and also new directions and methods. I am happy to announce that Yuval kindly agreed to write here an algorithmic column from time to time, and Daniel is considering contributing a guest post as well.” We are not in a hurry of any kind but I will try to check if the algorithm column idea could be revived.

Recent posts here on the blog

Here are links to our five most recent posts with very interesting mathematical news:

1) Marcelo Campos, Matthew Jenssen, Marcus Michelen, and Julian Sahasrabudhe: Striking new Lower Bounds for Sphere Packing in High Dimensions;

2) On Viazovska’s modular form inequalities by Dan Romik;

3) Oberwolfach Workshop: Geometric Algebraic and Topological Combinatorics 2023, part I;

4) Progress Around Borsuk’s Problem

5) Marton’s “Polynomial Freiman-Ruzsa” Conjecture was Settled by Tim Gowers, Ben Green, Freddie Manners and Terry Tao

More coming soon.


Marcelo Campos, Matthew Jenssen, Marcus Michelen, and Julian Sahasrabudhe: Striking new Lower Bounds for Sphere Packing in High Dimensions

A few days ago, a new striking paper appeared on the arXiv

A new lower bound for sphere packing

by Marcelo Campos, Matthew Jenssen, Marcus Michelen, and Julian Sahasrabudhe

Here is the abstract:

We show there exists a packing of identical spheres in \mathbb R^d with density at least \displaystyle (1-o(1))\frac{d \log d}{2^{d+1}} as d\to\infty. This improves upon previous bounds for general d by a factor of order \log d and is the first asymptotically growing improvement to Rogers’ bound from 1947.

Let me tell you a little about this amazing result. Congratulations, Marcelo, Matthew, Marcus, and Julian!

In unrelated news: Congratulations to József Balogh, Robert Morris, Wojciech Samotij, David Saxton and Andrew Thomason for being awarded the Steele Prize for seminal contribution to research.

I am thankful to Zachary Chase for telling me about this development.

A brief history

Earlier lower bounds: Every saturated sphere packing (namely, one into which you cannot squeeze an additional sphere) has density at least 2^{-d}. (This is because when you double the radii of all spheres you get a covering of space.) In 1905, this was improved by a factor of 2 + o(1) by Minkowski. The first asymptotic improvement was by Rogers in 1947, who gave a lower bound of \displaystyle c d 2^{-d}(1+o(1)). Rogers’ value for the constant was c=2/e, and this value was improved to 1.68, to 2, to 6/e (when d is divisible by 4), and then by Venkatesh to c=65963 (when d is large; I wonder, how large?). Venkatesh also proved a lower bound of the form 2^{-d} d \log \log d along a sparse sequence of dimensions. Here is the link to Venkatesh’s striking paper from a decade ago. The best upper bound, going back to Kabatiansky and Levenshtein, is of the form 2^{-(0.599+o(1))d}.

A theorem about graphs

One interesting aspect of the new paper is the use of combinatorial methods and of graph theory. The following theorem is closely related to results of Ajtai-Komlós-Szemerédi and of Shearer. For a graph G, \Delta (G) denotes the maximum degree of a vertex in G, and \Delta_2 (G) denotes the maximum codegree, namely, the maximum number of common neighbours a pair of distinct vertices in G can have. (\alpha(G) is the independence number of G.)

Theorem: Let G be a graph on n vertices, such that \Delta(G)\le \Delta  and \Delta_2(G)\le C \Delta (\log \Delta)^{-c}. Then

\displaystyle \alpha (G) \ge (1-o(1))\frac {n \log \Delta}{\Delta},

where o(1) tends to 0 as \Delta \to \infty and we can take C=2^{-7} and c=7.

This theorem (of much independent interest) is closely related to a theorem of Shearer for triangle-free graphs, and to an earlier theorem of Ajtai-Komlós-Szemerédi on Ramsey numbers r(3,n). The technique of proof is known as the Rödl nibble (or the semi-random method). Moving from Shearer-type theorems to results about sphere packing was pioneered twenty years ago by Krivelevich, Litsyn, and Vardy. (Their paper reproved a \displaystyle c d 2^{-d}(1+o(1)) lower bound using Shearer’s theorem.)

The graph G(X,r)

If X is a subset of \mathbb R^d, the graph G(X,r) has the set X as its set of vertices, and two vertices are adjacent if their distance is at most 2r. The proof of the new sphere packing bound is based on a discretization step, where you start with a huge ball \Omega and find a finite set of points X \subset \Omega such that G(X,r) has appropriate degrees and codegrees. The discretization step allows one to apply the graph-theoretic theorem directly and to obtain a sphere packing based on the large independent set that the theorem guarantees G(X,r) contains. (A toy illustration is sketched below.)
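Here is a toy two-dimensional illustration of this pipeline (my own sketch; the actual proof replaces the greedy step below by the independent-set theorem above, applied in high dimensions):

import math
import random

def toy_packing(num_points=4000, r=0.02, box=1.0, seed=1):
    # Sample a finite X in a box, then greedily extract an independent set of
    # G(X, r): centers at pairwise distance >= 2r, i.e., a packing of radius-r disks.
    random.seed(seed)
    X = [(random.uniform(0, box), random.uniform(0, box)) for _ in range(num_points)]
    centers = []
    for q in X:
        if all(math.dist(q, c) >= 2 * r for c in centers):
            centers.append(q)
    density = len(centers) * math.pi * r * r / box ** 2
    return centers, density

centers, density = toy_packing()
print(len(centers), "disks, density about", round(density, 3))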

Connection to physics and the replica method

The paper mentions the notion of amorphous sphere packings in physics and a prediction of Parisi and Zamponi, based on the “replica method,” that the largest density of “amorphous sphere packings” is \displaystyle (1+o(1))\frac{d \log d}{2^{d}}. This is twice the value achieved in the paper, and the authors even predict that their graph-theoretic theorem (and hence their sphere packing bound) can be improved by a factor of 2. Roughly speaking, amorphous sphere packings have no long-range correlations.

Spherical codes and kissing numbers.

The paper contains similar constructions for spherical codes and, in particular, for kissing numbers, and it adds to recent improvements on these questions. A spherical code is a collection of points on the unit sphere S^{d-1} with pairwise angles at least \theta. Denote by A(d,\theta) the maximal cardinality of such a spherical code. Let s_d(\theta) be the surface area of a spherical cap of radius \theta, normalized by the surface area of the entire sphere S^{d-1}. (We assume that \theta < \pi/2.) Only recently, Jenssen, Joos, and Perkins improved the easy bound of 1/s_d(\theta) (just look at a saturated family of spherical caps of radius \theta/2) by a linear factor in the dimension. The new paper adds an additional \log d factor.

KLS

Update: Interesting comments over Facebook.


On Viazovska’s modular form inequalities by Dan Romik

The main purpose of this post is to tell you about a recent paper by Dan Romik which gives a direct proof of two crucial inequalities in Maryna Viazovska’s proof that the E_8 lattice sphere packing is the densest sphere packing in eight dimensions.

Update: See the comment section for another striking progress on the sphere packing problem, this time in high (tending to infinity) dimensions.

On Viazovska’s modular form inequalities, by Dan Romik

Abstract

Viazovska proved that the sphere packing associated with the E_8 lattice, which has a packing density of \frac {\pi^4}{384}, is the densest sphere packing in 8 dimensions. Her proof relies on two inequalities between functions defined in terms of modular and quasimodular forms. We give a direct proof of these inequalities that does not rely on computer calculations.

This is wonderful news. Viazovska’s breakthrough was among the startling mathematical news of the past decade and I am always very excited about different proofs for known theorems, new proofs, simplified proofs, and at times even new complicated proofs and complifications.

Here is my original post on Viazovska’s proof (with many links) and a post by John Baez. And here are slides of a recent talk by Dan on his work.

I will say more about this paper below. First let me mention some related and unrelated things that you can learn from Dan Romik homepage.

DRb2v2

The cover of Dan’s new book with Viazovska’s magic function circled.

Topics in Complex Analysis, also by Dan Romik

This graduate-level mathematics textbook provides an in-depth and readable exposition of selected topics in complex analysis. The material spans both the standard theory at a level suitable for a first-graduate class on the subject and several advanced topics delving deeper into the subject and applying the theory in different directions. The focus is on beautiful applications of complex analysis to geometry and number theory. The text is accompanied by beautiful figures illustrating many of the concepts and proofs.

Among the topics covered are conformal mapping and the Riemann mapping theorem; the Euler gamma function, the Riemann zeta function, and a proof of the prime number theorem; elliptic functions, and modular forms. The final chapter gives the first detailed account in textbook format of the recent solution to the sphere packing problem in dimension 8, published by Maryna Viazovska in 2016 — a groundbreaking proof for which Viazovska was awarded the Fields Medal in 2022.

The entire book can freely be downloaded from the publisher, and section 6 gives a full account of the eight-dimensional sphere packing theorem.

Romik’s Ambidextrous Sofa and more from Romik’s homepage.

Romik’s page on the moving sofa problem contains a lot of information on this problem and links to many discussions, including one by John Baez linked above. Here are the pages of Dan’s very recommended 2015 book The Surprising Mathematics of Longest Increasing Subsequences and his recent book An Invitation to MadHat and Mathematical Typesetting (both are freely available).

Viazovska’s Magic functions

I had a Zoom meeting with Dan hoping to get some more information (that I can share with you) about his proof. Instead, Dan did not talk much about his work but rather enthusiastically explained to me Viazovska’s breakthrough, so let me tell you about that. Very good places to read about it are Viazovska’s original paper The sphere packing problem in dimension 8; Henry Cohn’s 2016 paper in the Notices of the AMS, A conceptual breakthrough in sphere packing; a 2016 paper by David de Laat and Frank Vallentin, A Breakthrough in Sphere Packing: The Search for Magic Functions; and Andrei Okounkov’s paper The magic of 8 and 24.

By the work of Cohn and Elkies, in order to prove that E_8 gives the densest sphere packing in \mathbb R^8, one needs to find the following “magic function”:

f is a real function on \mathbb R ^8  so that

1) f is radial: f(x) depends only on \|x\|, the distance from x to the origin.

2) f is a Schwartz function: it and all its derivatives decrease faster than the reciprocal of any polynomial.

3) f(0)=\widehat f(0)=1, where \widehat f is the Fourier transform of f.

4) \widehat f(x)\ge 0 for all x \in \mathbb R^8.

5) f(x) \le 0 for all x with \|x\| \ge \sqrt 2.

(Strictly speaking, a non-radial function that satisfies properties (2)-(5) could do the magic, and will also lead to a radial magic function.)
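One more aside of my own, not from Dan: by the Cohn-Elkies linear programming bound, a magic function with these properties certifies exactly the E_8 density \pi^4/384 quoted at the top, since the balls of the E_8 packing have radius \sqrt 2/2 (the centers lie at mutual distance at least \sqrt 2), and the density bound it gives is the volume of a single such ball. A one-line numerical check:

```python
import math

# Volume of the unit ball in R^8 is pi^4/24; scaling by radius sqrt(2)/2
# gives the density certified by a magic function with r = sqrt(2),
# which indeed equals the E_8 packing density pi^4/384.
ball = (math.pi**4 / 24) * (math.sqrt(2) / 2)**8
print(ball, math.pi**4 / 384)  # both ~0.25367
```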

Quite a lot of knowledge about these hypothetical magic functions had been gathered; in particular, both they and their Fourier transforms need to have zeros when the norm of x is of the form \sqrt{2n} for a positive integer n. Here is what Dan told me about the steps and ideas of Viazovska's work. Before that, Dan reminded me that a modular function of weight k satisfies

g(-1/\alpha)= \alpha^k g(\alpha).

(This property is just part of the definition of modular functions; for a full definition see Dan's book or the Wikipedia page on modular forms, under the heading "Modular forms for SL(2,Z)".)

So here are some of the main steps and ideas in Viazovska's construction, as I understood them from my conversation with Dan.

A) The first step is to consider eigenfunctions of the Fourier transform. One way of generating such eigenfunctions is via an integral formula of the form

\displaystyle \int _0^\infty g(i \alpha) e^{-\pi \alpha |x|^2} d\alpha,

where g is a modular function of weight 4 (for Fourier eigenfunctions in dimension 8).

(A word of explanation: the dual requirement on the zeros of both the function and its Fourier transform is the reason why it’s so useful to consider Fourier eigenfunctions – with such functions, once you manage to force them to have zeros somewhere, you get the same property for the Fourier transform for free.)
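(A further word of explanation that I am adding on my own responsibility; this is the standard calculation, not something I heard from Dan. In \mathbb R^8 the Gaussian transforms as \widehat {e^{-\pi \alpha \|x\|^2}}(\xi) = \alpha^{-4} e^{-\pi \|\xi\|^2/\alpha}. Applying this under the integral sign and substituting \beta = 1/\alpha gives

\displaystyle \widehat f(\xi) = \int_0^\infty g(i\alpha)\, \alpha^{-4} e^{-\pi \|\xi\|^2/\alpha}\, d\alpha = \int_0^\infty g(i/\beta)\, \beta^{2}\, e^{-\pi \beta \|\xi\|^2}\, d\beta,

so the modular transformation law, which relates g(i/\beta) to g(i\beta), is precisely what is needed to compare \widehat f with f itself.)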

B) Then comes the following bold idea: to force the locations of the zeros by multiplying by \sin^2(\pi \|x\|^2/2). At this point Dan shared his screen and started showing how to continue.

A few words of explanation: From Cohn and Elkies's work it was known that a (radial) magic function must have zeros on the spherical shells of radius \sqrt{2n} for n=1,2,\dots, and that all of these zeros should be of even order (as a function of \|x\|), except for the zero at \sqrt 2, which should be of odd order. The simplest way to satisfy these constraints is to look for a magic function with a zero of first order at \sqrt 2 and a double zero at \sqrt 4, \sqrt 6, \dots, and the simplest way to achieve that is to multiply by the factor \sin^2(\pi \|x\|^2/2). (This adds some complication, see below.)
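To see the double vanishing concretely, here is a quick symbolic check (my own illustration with sympy, not anything from Dan's talk): expanding \sin^2(\pi r^2/2) around r=\sqrt{2n} exhibits a zero of order exactly two, with leading coefficient 2\pi^2 n.

```python
import sympy as sp

r, eps = sp.symbols('r eps', positive=True)
f = sp.sin(sp.pi * r**2 / 2)**2

# Taylor-expand around r = sqrt(2n): the leading term is 2*pi^2*n*eps^2,
# so sin^2(pi r^2 / 2) has a double zero at every r = sqrt(2n)
for n in [1, 2, 3]:
    expansion = sp.series(f.subs(r, sp.sqrt(2 * n) + eps), eps, 0, 3)
    print(n, expansion)
```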

C) With the extra sine-squared factor, the condition on the function g that would ensure the resulting function is a Fourier eigenfunction is no longer that of a simple modular form of weight 4, and needs to be rethought. By expanding the sine in terms of complex exponentials, one can work out that g needs to satisfy a much more unusual functional equation. Viazovska realized that functions of this type can be constructed using combinations of modular forms of different weights, as well as a more exotic type of function known as a quasimodular form.

D) To make everything work it is also necessary to have an analytic continuation argument. This requires, so Dan explained to me, a certain deformation of complex contours that replaces three parallel lines with a figure that looks like a rake.

(Another word of explanation: multiplying by the factor \sin^2(\pi ||x||^2 / 2)  also forces a double zero at \sqrt 2, which is undesirable, and needs to be compensated for by multiplying by a function that has a pole at \sqrt 2 – this is why the extra step of analytic continuation is needed.)

E) After constructing two separate Fourier eigenfunctions, associated with the respective eigenvalues +1 and -1 and each constructed using the appropriate modular and quasimodular forms, Viazovska took their sum to obtain the required magic function.

Viazovska’s magic functions as presented in Dan’s book (and its cover) is:

\displaystyle \varphi_8(x) = -4\sin^2\left(\frac{\pi \lVert x \rVert^2}{2}\right) \times \int_0^\infty\left(108 \frac{(i t E_4'(it) + 4 E_4(it))^2}{E_4(it)^3-E_6(it)^2}+ 128 \left( \frac{\theta_3(it)^4 + \theta_4(it)^4}{\theta_2(it)^8}+ \frac{\theta_4(it)^4 - \theta_2(it)^4}{\theta_3(it)^8}\right)\right) e^{-\pi \lVert x \rVert^2 t}\,dt

The definitions of the Eisenstein series E_2, E_4, E_6 and the Jacobi thetanull functions \theta_2, \theta_3 and \theta_4 are given below. Enjoy these beautiful formulas!

How to prove positivity?

Viavzkova’s brilliant ideas led to a function with zeroes at the right locations. The nonnegativity requirements 4) and 5) were not obvious at all.  To complete the proof it was necessary to prove them.   It was clear to Viazovska that this part can be done numerically and this consists of the computational part of her proof.

The divisor function, Eisenstein series and Jacobi thetanull functions

Here are the definitions of the functions that appear in Viazovska’s magic function. Let us start with the divisor function

\sigma_\alpha(n) = \sum_{d\,|\,n} d^\alpha.

We will also need the definitions of the Eisenstein series E_2, E_4, E_6, and the Jacobi thetanull functions \theta_2, \theta_3 and \theta_4. Put q=e^{\pi i z}.

E_2(z) = \displaystyle 1 - 24 \sum_{n=1}^\infty \sigma_1(n) q^{2n},

E_4(z)  =  \displaystyle 1 + 240 \sum_{n=1}^\infty \sigma_3(n) q^{2n},

E_6(z)  =  \displaystyle 1 - 504 \sum_{n=1}^\infty \sigma_5(n) q^{2n},

\theta_2(z)  =  \displaystyle \sum_{n=-\infty}^\infty q^{(n+1/2)^2},

\theta_3(z)  =  \displaystyle \sum_{n=-\infty}^\infty q^{n^2},

\theta_4(z) = \displaystyle \sum_{n=-\infty}^\infty (-1)^n q^{n^2}.
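These q-series converge very fast for z=it with t bounded away from 0, which makes them pleasant to play with numerically. Here is a minimal Python sketch of truncated versions (my own toy code, not anything from Dan's book); as written it should only be trusted for moderate t, say 1/2 \le t \le 2:

```python
import math

N = 120  # truncation order; terms decay like q^(2n), so this is plenty for moderate t

def sigma(k, n):
    """Divisor power sum sigma_k(n) = sum of d^k over the divisors d of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

# Eisenstein series as functions of q2 = q^2 = e^{2 pi i z};
# for z = it on the imaginary axis, q2 = e^{-2 pi t} is real and small.
def E2(q2): return 1 - 24 * sum(sigma(1, n) * q2**n for n in range(1, N))
def E4(q2): return 1 + 240 * sum(sigma(3, n) * q2**n for n in range(1, N))
def E6(q2): return 1 - 504 * sum(sigma(5, n) * q2**n for n in range(1, N))

# Jacobi thetanull functions as functions of q = e^{pi i z}; at z = it, q = e^{-pi t}.
def theta2(q): return sum(q**((n + 0.5)**2) for n in range(-N, N))
def theta3(q): return 1 + 2 * sum(q**(n**2) for n in range(1, N))
def theta4(q): return 1 + 2 * sum((-1)**n * q**(n**2) for n in range(1, N))

# two classical sanity checks at z = i (so q2 = e^{-2 pi}): E6(i) = 0 and E2(i) = 3/pi
q2 = math.exp(-2 * math.pi)
print(E6(q2))                # ~0
print(E2(q2) - 3 / math.pi)  # ~0

# weight-4 modularity of E_4: E_4(i/t) = t^4 E_4(it)
t = 1.3
print(E4(math.exp(-2 * math.pi / t)) - t**4 * E4(math.exp(-2 * math.pi * t)))  # ~0
```

The last line is the transformation law g(-1/\alpha)=\alpha^k g(\alpha) mentioned above, tested for g=E_4 and k=4 on the imaginary axis.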

Viazovska’s inequalies and Dan’s proof

Dan took it as a challenge to prove Viazovska’s two inequalities without using numerical calculations. This was not an easy project and after several unsuccessful attempts he succeeded.

Viazovska’s inequalities

Now we need two functions:
\displaystyle \phi_0(z)  = \displaystyle 1728 \frac{(E_2(z) E_4(z) - E_6(z))^2}{E_4(z)^3 - E_6(z)^2},

and \displaystyle \psi_I(z)  = \displaystyle 128 \left( \frac{\theta_3(z)^4 + \theta_4(z)^4}{\theta_2(z)^8} + \frac{\theta_4(z)^4 - \theta_2(z)^4}{\theta_3(z)^8} \right),

and need further to define functions A(t), B(t) of a real variable t>0 by
\displaystyle A(t)  = -t^2 \phi_0(i/t) - \frac{36}{\pi^2} \psi_I(it),

and

\displaystyle B(t)  = -t^2 \phi_0(i/t) + \frac{36}{\pi^2} \psi_I(it).

Theorem [Viazovska’s modular form inequalities]:

The functions A(t), B(t) satisfy

  1. \displaystyle A(t) < 0 \qquad (t>0),
  2. \displaystyle B(t) > 0 \qquad (t>0).
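A numerical sanity check is immediate with the truncated q-series from the sketch above (this is only a spot check at a few values of t, not a proof; the content of Viazovska's and Dan's work is that the inequalities hold for all t>0):

```python
import math
# continues the earlier sketch: uses E2, E4, E6, theta2, theta3, theta4 from there

def phi0(q2):
    """phi_0(z) as a function of q2 = e^{2 pi i z}."""
    e2, e4, e6 = E2(q2), E4(q2), E6(q2)
    return 1728 * (e2 * e4 - e6)**2 / (e4**3 - e6**2)

def psiI(q):
    """psi_I(z) as a function of q = e^{pi i z}."""
    t2, t3, t4 = theta2(q), theta3(q), theta4(q)
    return 128 * ((t3**4 + t4**4) / t2**8 + (t4**4 - t2**4) / t3**8)

def A(t):
    # phi_0 is evaluated at z = i/t (nome e^{-2 pi/t}), psi_I at z = it (nome e^{-pi t})
    return -t**2 * phi0(math.exp(-2 * math.pi / t)) - (36 / math.pi**2) * psiI(math.exp(-math.pi * t))

def B(t):
    return -t**2 * phi0(math.exp(-2 * math.pi / t)) + (36 / math.pi**2) * psiI(math.exp(-math.pi * t))

for t in [0.5, 1.0, 1.5, 2.0]:
    print(t, A(t), B(t))  # expect A(t) < 0 < B(t)
```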

I will not say much here about Dan's proof. The proof of the first inequality is fairly short, and it relies on various well-known (but not to me) modular function identities, like this standard (!) one

E_4^3 - E_6^2 = \displaystyle 1728 q^2 \prod_{n=1}^\infty (1-q^{2n})^{24},

and an identity by Jacobi

\displaystyle \theta_2^4+\theta_4^4 = \theta_3^4.

The proof of the second inequality is considerably harder.
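Both displayed identities are easy to test numerically with the truncated series from the sketch above (a sanity check of the formulas as I copied them, of course, not of the proof):

```python
import math
# continues the earlier sketch: uses N, E4, E6, theta2, theta3, theta4 from there

t = 0.9
q = math.exp(-math.pi * t)
q2 = q * q

# the "standard (!)" identity E_4^3 - E_6^2 = 1728 q^2 prod (1 - q^{2n})^24
prod = 1.0
for n in range(1, N):
    prod *= (1 - q2**n)**24
print(E4(q2)**3 - E6(q2)**2 - 1728 * q2 * prod)   # ~0

# Jacobi's identity theta_2^4 + theta_4^4 = theta_3^4
print(theta2(q)**4 + theta4(q)**4 - theta3(q)**4)  # ~0
```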

Posted in Combinatorics, Convexity, Geometry, Number theory | 4 Comments

Oberwolfach Workshop: Geometric Algebraic and Topological Combinatorics 2023, part I

This week I am participating (remotely) in an exciting Oberwolfach meeting on Geometric, Algebraic and Topological Combinatorics. See this post about a whole sequence of related meetings. Let me first tell you about the first two lectures.

Federico Ardila, Recent developments in the intersection theory of matroids.

The Chow ring records the intersection of subvarieties of an algebraic variety, and Federico talked about four different ways to define the Chow ring of a toric variety. One of the ways, due to Lou Billera, is based on the study of splines; other ways are related to early works of Sturmfels, of Ardila and Klivans, and of quite a few others. (Federico referred to a future survey paper where these notions are explained, and here is an older related survey by him, The geometry of geometries: matroid theory, old and new.) Federico explained how the different descriptions of the Chow ring enable different proofs of combinatorial results, such as the unimodality of the coefficients of chromatic polynomials.

Vic Reiner, Stirling numbers and Koszul algebras with symmetry

We start with combinatorial objects like the classic Stirling numbers of the first and second kind, and move on to graded algebraic objects whose Hilbert functions (which record the dimensions of the graded parts) correspond to these combinatorial objects. (This is the theme of combinatorial commutative algebra.) Next we refine the Hilbert functions in terms of irreducible representations of a natural group action. (This is the theme of combinatorial representation theory.) Vic discussed all sorts of miracles, including the Koszul property and the phenomenon of representation stability. Just as Vic was discussing the definition of Koszul algebras there were sirens, and I had to leave the lecture for a shelter for about five minutes.

Gaku Liu, A Regular Unimodular Triangulation of the Matroid Base Polytope

In the third lecture, Gaku Liu presented (almost with full details) an ingenious inductive proof that every matroid polytope has a regular unimodular triangulation.

Of course, remote participation in a conference is not as good as real participation. There are many friends among the participants that I would like to talk with about their mathematical and other endeavors, and there are usually quite a few informal academic and other activities. Also, the Oberwolfach atmosphere, scenery and library are quite inspiring. (I have also heard good things from colleagues about the food.) On the other hand, Zoom participation is not as bad as not participating at all, and possibly in remote lectures I find myself day-dreaming and losing attention less than I would in person, but I am not sure about it.

Let me give you a bird's-eye view of the talks in the first three sessions. Our meetings in this series are eclectic, and I once described them as meetings where each participant is a member of a very-appreciated minority. Naturally, I can be more precise regarding things I am more familiar with, and, for the purpose of making this post timely, I will not do justice to the speakers' coauthors and collaborators. There are many things I would love to come back to later in the blog. Any corrections and additions, privately or in the comment section, are welcome.

Monday morning

I already mentioned Gaku's talk about unimodular triangulations; an open problem that I remember is to understand the cases where there are such triangulations that are "flag" (no missing faces of dimension 2 or more).

Monday afternoon

Eran Nevo gave a talk about rigidity expander graphs. Rigidity extends connectivity, and expansion is a quantitative form of connectivity; Eran and his collaborators extend the spectral notion of expansion to d-rigidity. Hailun Zheng talked about stress spaces, reconstruction problems, and lower bound problems. Certain high-dimensional stress spaces describe the g-numbers of polytopes, and Hailun and her collaborators answered questions that I raised long ago on whether the space of stresses, together with the corresponding skeleton, determines the combinatorial structure, and at times even the affine structure, of a simplicial polytope. Patricia Hersh described a relaxation of the notion of recursive atom ordering that still implies CL-shellability. These terms refer to famous notions of shellability of posets that were especially useful for the study of Coxeter groups. The work provides new avenues for proving shellability.

Tuesday morning

Florian Frick talked about topological methods in zero-sum Ramsey theory. The starting point was a topological proof of the Erdős–Ginzburg–Ziv theorem. Erdős, Ginzburg and Ziv proved that every sequence of elements of \mathbb Z_n of length at least 2n-1 contains a subsequence of length n with zero sum; this is a starting point for many results in additive combinatorics, and Florian's lecture promises topological methods in this area. Pablo Soberon talked about high-dimensional envy-free partitions. Suppose that you have several measures in, say, \mathbb R^d. It is a classical question to find partitions (of various kinds) into equal parts with respect to all these measures. Envy-free partitions deal with the case where the k-th measure represents the utility of the k-th individual; an envy-free partition is one where every individual receives, according to her own measure, at least as much as any other individual. This is a classic notion in game theory, and envy-free ham sandwiches could now be a very nice direction. Kevin Piterman talked about fixed points on contractible spaces, with an emphasis on groups acting on 2-dimensional contractible spaces. Here there are classical results and questions by Smith, Oliver, Quillen, Aschbacher, Segev, and others (I know Segev personally), and Kevin presented a solution to a central problem about the existence of fixed points for every finite group acting on a compact complex. Roy Meshulam's lecture on random balanced Cayley complexes was a very rich blend of combinatorial, topological, Fourier-theoretic, and algebraic methods.

Let me stop here (publishing the post in a drafty mode, to be improved later) and continue with the other days in a separate post. In a few minutes our problem session starts!

Update: Let me tell you about the fascinating last problem of the problem session, posed by Pablo Soberon:

Consider n lines in the plane, no two of them parallel. Define the "measure" of a set A as the maximum number of lines all of whose pairwise intersection points lie in A (or 1 if there is no such pair of lines). Consider a (Voronoi) partition of the plane into three cones A, B, C. The conjecture is that the product of the three measures is at least n.
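Here is a brute-force sketch for experimenting with the conjecture on random instances with small n (entirely my own reading of the statement; in particular, I interpret the three cones as the Voronoi cells of three points, and all names below are mine):

```python
import itertools, math, random

random.seed(1)
n = 7

# n random lines y = a*x + b; the slopes are distinct with probability 1,
# so no two lines are parallel
lines = [(random.uniform(-3, 3), random.uniform(-1, 1)) for _ in range(n)]

def intersection(l1, l2):
    (a1, b1), (a2, b2) = l1, l2
    x = (b2 - b1) / (a1 - a2)
    return (x, a1 * x + b1)

# three Voronoi sites; their Voronoi cells are three unbounded cone-like regions
sites = [(0.0, 1.0), (-math.sqrt(3) / 2, -0.5), (math.sqrt(3) / 2, -0.5)]

def region(p):
    """Index of the Voronoi cell containing the point p."""
    return min(range(3), key=lambda k: (p[0] - sites[k][0])**2 + (p[1] - sites[k][1])**2)

def measure(r):
    """Max number of lines whose pairwise intersections all lie in cell r (at least 1)."""
    best = 1
    for size in range(2, n + 1):
        for S in itertools.combinations(range(n), size):
            if all(region(intersection(lines[i], lines[j])) == r
                   for i, j in itertools.combinations(S, 2)):
                best = size  # sizes are tried in increasing order
    return best

m = [measure(r) for r in range(3)]
print(m, "product =", m[0] * m[1] * m[2], ">= n ?", m[0] * m[1] * m[2] >= n)
```

(For n around 15 one would want a smarter maximum-clique search, but for a first experiment this exhaustive check is instant.)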


Posted in Combinatorics, Conferences | 2 Comments

Progress Around Borsuk’s Problem

I was excited to see the following 5-page paper:

Convex bodies of constant width with exponential illumination number by Andrii Arman, Andrii Bondarenko, and Andriy Prymak

Abstract: We show that there exist convex bodies of constant width in \mathbb E^n with illumination number at least (\cos(\pi/14)+o(1))^{-n}, answering a question by G. Kalai. Furthermore, we prove the existence of finite sets of diameter 1 in \mathbb E^n which cannot be covered by (2/\sqrt{3}+o(1))^{n} balls of diameter 1, improving a result by J. Bourgain and J. Lindenstrauss.

Let me tell you a little about it. (I will switch from n to d.) Borsuk's problem asks for the smallest integer f(d) so that every set of diameter 1 in \mathbb R^d can be covered by f(d) sets of smaller diameter.

The best asymptotic upper bound from 1987 by Oded Schramm is exponential in d,

f(d) \le (\sqrt {3/2}+o(1))^d.

(Note that \sqrt {3/2}=1.2247\ldots.) There is also an elegant 1982 upper bound by Marek Lassak:

f(d) \le 2^{d-1}+1

There are two different proofs for the upper bound  f(d) \le (\sqrt {3/2}+o(1))^d, and both proofs give stronger results.

The first proof, by Oded Schramm, is based on covering a set of constant width by smaller homothets of the same set. It has been an outstanding question since the late 1980s whether the assertion of Schramm's result could be strengthened, with (\sqrt {3/2}+o(1))^d replaced by (1+o(1))^d.

Arman, Bondarenko, and Prymak have now shown that this is not the case: they constructed a convex body of constant width that cannot be covered by fewer than

(1.026\ldots + o(1))^d

homothets of smaller diameter. Here 1.026\ldots stands for 1/\cos(\pi/14).

The second proof, from 1991, of the upper bound f(d) \le (\sqrt {3/2}+o(1))^d, by Jean Bourgain and Joram Lindenstrauss, is based on covering a set of diameter 1 with balls of diameter 1. (A ball of diameter 1 can be covered by d+1 balls of smaller diameter.) For covering by balls, it was known that the upper bound cannot be pushed down to (1+o(1))^d: an old result of Danzer from the mid 1960s shows that we need at least 1.001^d balls of diameter 1 to cover an arbitrary set of diameter 1 in \mathbb R^d. In their paper, along with the upper bound, Bourgain and Lindenstrauss improved Danzer's 1.001^d to 1.0645^d. The second theorem of the new paper improves 1.0645 to 2/\sqrt 3 = 1.1547\ldots.
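For quick reference, here is an evaluation of the closed-form constants quoted in this post (including the constant from Glazyrin's improvement mentioned in comment 1 below); the comments are mine:

```python
import math

print(math.sqrt(3 / 2))                             # 1.2247..., Schramm's upper bound base
print(1 / math.cos(math.pi / 14))                   # 1.0257..., the new lower bound for homothets
print(0.25 * math.sqrt((111 - math.sqrt(33)) / 6))  # 1.0471..., Glazyrin's improvement (comment 1 below)
print(2 / math.sqrt(3))                             # 1.1547..., the new lower bound for covering by balls
```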


Some more comments:

  1. The first (main) result of Arman, Bondarenko, and Prymak was already improved by Alexey Glazyrin in his 4-page note on illuminating constant width bodies, where the constant 1.026\ldots is improved to \approx 1.047\ldots, which stands for \frac {1}{4} \sqrt {\frac {1}{6}(111-\sqrt {33})}.
  2. Schramm’s result as well as the new result are formulated in terms of illumination numbers. “A boundary point x of a convex n-dimensional body B is said to be illuminated by a direction (unit vector) \ell if the ray from x in the direction of \ell  intersects the interior of B. The set of directions \cal L  illuminate B if each boundary point of B is illuminated by some direction from \cal L. A natural question is to determine the minimal size of an illuminating set, also called an illumination number, for a given B or for a given class of convex n-dimensional bodies.” The illumination number of a convex set B equals the number of homothets of smaller diameter needed to cover B.
  3. Let K be a set of diameter one, and let {\cal B}(K) be the class of sets of diameter one of the form \alpha K + \beta B, where B is a ball. It would be interesting to know whether, by using such sets (of diameter 1), the upper bound of (\sqrt {3/2}+o(1))^d could be pushed down to (\sqrt {3/2}-\delta)^d for some \delta >0. It would also be interesting to find a lower bound of the form (1+\delta)^d for covering with such sets.
  4. We have had several posts about and around Borsuk's problem, but I missed a lot of very nice results and questions, in the blog and even in my survey article from 2015.
  5. Here is a quick summary of the best known (asymptotic) lower bounds for Borsuk's problem and related problems. The Frankl-Wilson bound for the double cap conjecture is 1.13975\ldots^{-n}, and Raigorodskii's (world record) bound for the double cap conjecture is 1.1547\ldots^{-n}; the latter gives a lower bound of 1.225^{\sqrt d} for the Borsuk problem.
  6. Borsuk conjectured that f(d)=d+1, namely, every set of diameter 1 can be covered by d+1 sets of smaller diameter.  The smallest dimension where this is known to be false is 64 and this was proved by Thomas Jenrich. 
  7. Schramm asked if there are convex sets of constant width with exponentially smaller volume compared to a ball of the same width. (See this mathoverflow question.) He also asked the question for spherical sets. A negative answer would push (as far as I remember) the constant in his bound from \sqrt {3/2}=1.2247\ldots down to \sqrt {4/3}=1.1547\ldots.


Posted in Convexity, Geometry, Music | 1 Comment