“251 Mercer street” I said, “it is in the village,” “I know that” said the driver “are you going to NYU?” “Yes” I said. I was going to give a lecture at Ricky and Eli’s Tuesday evening’s Geometry Seminar and came earlier to have a chance to chat with the guys. “Let me tell you something” the driver said, “I did not finish elementary school, I left in the 6th grade and started working, and everyday I come home and have sex. My friends who finished elementary school have sex 2-3 times a week, tops. And , you know,” continued the driver “those who went on to finish high school, they have sex maybe once a week, no more.” “I see,” I said. Continue reading

# The Simplified Pill Algorithm

Ok so I had to take one pill every day, and fortunately the pill package was marked, it contained 14 pills with labels Sunday, Monday and so on. It started with Sunday, which was the mark for the leftmost pill. The pharmacy gave me 30 pills every month, two packages and 2 separately, which meant that taking the pills would be out of synchronization.

However, I had it all figured out: All I need is to remember a single number between -3 and 3 for the entire four weeks. Say the starting day is Tuesday. Then I had to remember the number -2. On Tuesday I take the leftmost pill for Sunday, on Wednesday I take the pill for Monday, on Thursday I take the pill for Tuesday,  and so on. If, for example, the starting date is Thursday, then I have to remember +3. The pill for Sunday I take on Thursday, and then the pill for Monday I take on Friday etc. Easy. (I tried to write this number of the package but this didn’t work.)

I can remember several 7 digit phone numbers so, in theory, I did not see any difficulty in remembering one small integer number for an entire month. The math itself was not too challenging.

In practice it did not work so well. Sometimes I could not remember: either I already took the pill and the magic number is 2 or I did not and it is 3. Other times, I wondered if the magic number +2 or -2.

# Sarkaria’s Proof of Tverberg’s Theorem 2

Karanbir Sarkaria

## 4. Sarkaria’s proof:

Tverberg’s theorem (1965): Let $v_1, v_2,\dots, v_m$ be points in $R^d$, $m \ge (r-1)(d+1)+1$. Then there is a partition $S_1,S_2,\dots, S_r$ of $\{1,2,\dots,m\}$ such that $\cap _{j=1}^rconv (v_i: i \in S_j) \ne \emptyset$.

Proof: We can assume that $m=(r-1)(d+1)+1$. First suppose that the points $v_1, v_2,\dots,v_m$ belong to the $d$-dimensional affine space $H$ in $V=R^{d+1}$ of points with the property that the sum of all coordinates is 1. Next consider another $(r-1)$-dimensional vector space $W$ and $r$ vectors in $W$, $w_1,w_2, \dots, w_r$  such that $w_1+w_2+\dots +w_r = 0$  is the only linear relation among the $w_i$s. (So we can take $w_1,\dots w_{r-1}$ as the standard basis in $R^{r-1}$ and $w_r=-w_1-w_2-\dots-w_r$. )

Now we consider the tensor product $V \otimes W$.

Nothing to be scared of: $U=V \otimes W$ can be regarded just as the $(d+1)(r-1)$-dimensional vector space of d+1  by $r-1$ matrices. We can define the tensor product  $x \otimes y$ of two vectors $x = (x_1,x_2,\dots,x_{d+1}) \in V$ and $y =(y_1,y_2,\dots,y_{r-1}) \in W$, as the (rank-one) matrix $(x_i \cdot y_j)_{1 \le i \le d+1,1 \le j \le r-1}$.

Consider in U the following sets of points:

$S_1 = \{v_1 \otimes w_j: j=1,2,\dots r \}$

$S_2 = \{v_2 \otimes w_j: j=1,2,\dots r \}$

$S_i = \{v_i \otimes w_j: j=1,2,\dots r \}$

$S_m = \{v_m \otimes w_j: j=1,2,\dots r \}.$

Note that 0 is in the convex hull of every $S_i$. Indeed it is the sum of the elements in $S_j$. (And thus $0=1/r(v_i \otimes w_1+ v_i \otimes w_2 + \dots + v_i \otimes w_r)$. )

By the Colorful Caratheodory Theorem we can  Continue reading

# Sarkaria’s Proof of Tverberg’s Theorem 1

Helge Tverberg

Ladies and gentlemen, this is an excellent time to tell you about the beautiful theorem of Tverberg and the startling proof of Sarkaria to Tverberg’s theorem (two parts). A good place to start is Radon’s theorem.

## 1. The theorems of Radon, Helly, and Caratheodory.

Radon’s theorem: Let $x_1,x_2,\dots, x_m$ be points in $R^d$, $m \ge d+2$. Then there is a partition $S,T$ of $\{1,2,\dots,m\}$ such that $conv(x_i: i \in S) \cap conv(x_i: i \in T)$ $\ne \emptyset$.  (Such a partition is called a Radon partition.)

Proof: Since $m>d+1$ the points $x_1,x_2, \dots, x_m$ are affinely dependent. Namely, there are coefficients, not all zero,  $\lambda_1,\lambda_2,\dots,\lambda_m$ such that $\sum _{i=1}^m \lambda_i x_i = 0$ and $\sum _{i=1}^m \lambda_i=0$. Now we can write $S=\{i:\lambda_i >0\}$ and $T = \{i: \lambda_i \le 0\}$, and note that

(*) $\sum_{i \in S} \lambda_i x_i = \sum _{i \in T} (-\lambda_i) x_i$,

and also $\sum _{i \in S} \lambda_i = \sum_{i \in T} (-\lambda_i)$. This last sum is positive because not all the $\lambda$s are equal to zero. We call it $t$.

To exhibit a convex combination of $x_i: i\in S$ which is equal to a convex combination in $x_i: i \in T$ just divide relation (*) by $t$. Walla.

This trick of basing a partition on the signs of the coefficients repeats in other proofs. Take note!

Radon used his theorem to prove Helly’s theorem.

### Helly’s theorem

Helly’s theorem: For a family $K_1,K_2,\dots, K_n$, $n \ge d+1$, of convex sets in $R^d$, if every $d+1$ of the sets have a point in common then all of the sets have a point in common.

Proof: It is enough to show that when $n> d+1$ if every $n-1$ of the sets have a point in common then there is a point in common to them all. Let $x_k \in \cap\{K_j: 1\le j\le n, j \ne k\}$. In words, $x_k$ is a point that belongs to all the $K_j$s except perhaps to $K_k$. We assumed that such a point $x_k$ exists for every $k$.

Now we can apply Radon’s theorem: Consider the Radon partition of the points, namely a partition $S,T$ of $\{1,2,\dots,m\}$ such that $conv(x_i: i \in S) \cap conv(x_i: i \in T)$ $\ne \emptyset$.  Let $z \in$ $conv(x_i: i \in S) \cap conv (x_i: i \in T)$. Since $z$ belongs to the convex hull of  $conv (x_i: i \in S)$ and every $x_i$ for $i \in S$ belongs to every $K_j$ for every $j$ not in $S$ we obtain that $z$ belongs to every $K_j$ for $j$ not in $S$. By the same argument $z$ belongs to every $K_j$ for $j$ not in $T$. Since $S$ and $T$ are disjoint, the point $z$ belongs to all $K_j$s. Ahla.

The proof of Helly’s theorem from Radon’s theorem as described on the cover of a book by Steven R. Lay

### Caratheodory’s theorem

Caratheodory’s theorem: For $S \subset R^d$, if $x \in conv (S)$ then $x \in conv (R)$ for some $R \subset S$, $|R| \le d+1$.

Like for Radon’s theorem, there is a simple linear algebra proof. We can assume that $S$ is finite; we start with a presentation of $x$ as a convex combination of vectors in $S$, and if $|S|>d+1$ we can use an affine dependency among the vectors in $S$ to change the convex combination, forcing one coefficient to vanish.

Without further ado here is Tverberg’s theorem.

## 2. Tverberg’s Theorem

Tverberg Theorem (1965):Let $x_1,x_2,\dots, x_m$ be points in $R^d$, $m \ge (r-1)(d+1)+1$. Then there is a partition $S_1,S_2,\dots, S_r$ of $\{1,2,\dots,m\}$ such that $\cap _{j=1}^rconv (x_i: i \in S_j) \ne \emptyset$.

So Tverberg’s theorem is very similar to Radon’s theorem except that we want more than two parts! Continue reading

A transparency and a lecture using transparencies. (No relation to the advice.)

### Worst – The same as above except that the transparencies are handwritten in an unreadable way.

Q: What do Yehuda Agnon, Michael Ben-Or, Ehud Lehrer, Uzi Segal (whose calibration theorem was mentioned in the post on the controversy around expected utility theory), Mike Werman, myself, and quite a few others, all have in common?

Hint:

Answer: We all, (and altogether more than 20 students) took part in Continue reading

# Thomas Bayes and Probability

How can we assign probabilities in cases of uncertainty?  And what is the nature of probabilities, to start with?  And what is the rational mechanism for making a choice under uncertainty?

Thomas Bayes lived in the eighteenth century.  Bayes’ famous formula shows how to update probabilities given some new evidence. Following is an example for an application of Bayes’ rule:

Suppose that ninety percent of pedestrians cross a certain crosswalk when the light is green, and ten percent cross it when the light is red. Suppose also that the probability of being hit by a car is 0.1% for a pedestrian who crosses on a green light, but the probability of being hit by a car is 2% for a pedestrian who crosses on a red light. A pedestrian is hit by a car at this particular crossing and brought to the hospital. How likely is it that he crossed on a red light?

Well, to start with (or a priori), only ten percent of the people who cross the crosswalk cross it on a red light, but now that we are told that this person was hit by a car it makes the probability that he crossed illegally higher. But by how much? Bayes’ rule allows us to compute this (a posteriori) probability. I will not describe the mathematical formula, but I will tell you the outcome: the probability that this person crossed on a red light is 2/3.

The Bayesian approach can be described as follows. We start by assigning probabilities to certain events of interest and, as more evidence is gathered, we update these probabilities. This approach is applied to mundane decision-making and also to the evaluation of scientific claims and theories in philosophy of science.

Bayes’ rule tells us how to update probabilities but we are left with the question of how to assign probabilities in cases of uncertainty to begin with. What is the probability of success in a medical operation? What is the chance of your team winning the next baseball game? How likely is it that war will break out in the Middle East in the next decade? How risky are your stock-market investments? Continue reading

The following paragraph is taken from the original “too personal for publication draft” of an article entitled ” ‘Final values’ of functors” by Shmuel Weinberger for a volume in honor of Guido Mislin’s retirement from ETH. (L’enseignement Mathematique 54(2008), 180-182.) Shmuel’s remarks about making conjectures and the different types of conjectures appear here for the first time.

“Final Values” of Functors

Shmuel Weinberger

A personal letter for, and to, Guido Mislin, a mathematician who always epitomized for me elegance, taste, and precision.

Guido Mislin

I have, on very rare occasions, conjectured things I really believe with evidence by analogy and the religious belief in the essential simplicity of the mathematical universe (looked at correctly)[1].  Besides beauty, such a conjecture[2] pragmatically serves as a guide through unknown landscape.  And even an unreliable guide helps to point in the right direction. In fact, an unreliable guide known to be unreliable can be useful indeed.  Sometimes one makes conjectures knowing them to be false, but feeling that their falsity is a deep phenomenon and most of the predictions made with the conjecture as guide will be true[3].  On other times, I have conjectured to lay down the gauntlet:  “See, you can’t even disprove this ridiculous idea.” [4] On yet other times, the conjectures come from daydreams:  it would be so nice if this were true[5].  And, yet on others one makes a conjecture hoping to probe the landscape that other conjectures have already illuminated[6].

The conjecture (and speculations) that I would like to present to you, Guido, is motivated by ideas I learnt from Goodwillie and Weiss, but I think it is also much in the spirit of the way you sometimes approach mathematics and it is somewhere between the last two kinds…

[1] These include the package of conjectures about homology manifolds made in [BFMW], which incidentally flatly contradict other standard conjectures like the Bing-Borsuk conjecture.

[2] Here I have in mind things like the Riemann hypothesis, indeed huge swaths of number theory; in topology and geometry, Thurston’s geometricization conjecture, Baum-Connes conjecture and Novikov conjectures, and so on are examples.

[3] Here I am thinking of the equivariant version of the Borel conjecture, or the stratified version. See [W, Chapter 13].

[4] I think this is what Lott had in mind (although he was usually careful to call it a question) — but I have less concern for my reputation, and am willing to conjecture the opposite of what I believe -when he formulated the zero-in-the-spectrum problem for all complete manifolds.  And it was disproved in [FW].  On the other hand, my dear friend and mentor Donald Newman made a conjecture of this same sort once in rational approximation, only to have it proven by V.A.Popov.

[5] This must be the case for, say, Gromov’s questions about large Riemannian manifolds [G] or the Weinberger conjecture discussed first in print in [D]

[6] The zero in the spectrum problem for universal covers of aspherical manifolds or even uniformly contractible manifolds, also discussed by Lott, is of this sort.  The conjecture of rationality of the difference of twisted APS invariants for homotopy equivalent manifolds was made after realizing that it would follow from the Borel conjecture for torsion free groups.  By now, it’s been proven three times.  [FLW][K][GHW] and [HR].

As further afterthoughts Shmuel writes: Continue reading

# Would you decide the election if you could?

One mental experiment I am fond of asking people (usually before elections) is this:

Suppose that just a minute before the votes are counted you can change the outcome of the election (say, the identity of the winner, or even the entire distribution of ballots) according to your own preferences. Let’s assume that this act will be completely secret. Nobody else will ever know it.

Will you do it?

You can answer in this poll:

# Impagliazzo’s Multiverse

Update (July 2009): Here are links to a related post on Lipton’s blog, and a conference announcement on Russell’s possible worlds.

On the occasion of Luca’s post on his FOCS 2008 tutorial on average-case complexity here is a reminder of Russell Impagliazzo’s description of the possible universes we live in.

Being at FOCS was a splendid experience and my presentation of the paper of Ehud Friedgut, Noam Nisan and myself was well accepted. (And reported by Rahul Santhanam on “Computational Complexity“) The slides are here; Initially I felt bad for including only one out of three parts of the proof but eventually I had no time to describe any part of the proof.

## Russell Impagliazzo’s multiverse – cryptography and complexity

Impaglliazzo’s paper deals with two central topics in theoretical computer science: Computational complexity studies the power of computers and cryptography studies the ability to create codes and to hide information. These two areas added beautiful new scientific insights, and the deep connections between them were described by computer scientist Shafi Goldwasser as “a match made in heaven.” The limited power of computers and the well known question “whether P=NP” are of central importance in Impaglliazzo’s classification,  as are the concepts of “one way functions” and “public keys”.

### ALGORITHMICA –

In this universe computers can quickly solve hard problems.  (In technical terms: P=NP.)  If you can quickly recognize a valid solution to a problem once the latter is presented to you,  (such problems are “in NP”), then you can apply this same method in order to actually find a solution! In this universe almost all optimization problems have a quick and automatic solution. No cryptography is possible and therefore there is no privacy.

Muhammad ibn Musa al-Khwarizmi (the term “algorithm” is called after his name, and not after All Gore as commonly believed)

# Detrimental Noise

### John Lennon

Disclaimer: It is a reasonable belief  (look here, and here), and an extremely reasonable working assumption (look  here) that computationally superior quantum computers can be built.

(This post and the draft will be freely updated) I am struggling to meet the deadline of a week ago for a chapter regarding adversarial noise models for quantum error correction. (Update Nov 20: here is the draft; comments are welcomed. Update April 23: Here is the arxived paper, comments are welcome! ) My hypothetical model is called “detrimental.” (This is a reason substantial math postings are a bit slow recently; but I hope a few will come soon.) This project is quite central to my research in the last three years, and it often feels like running, over my head, after my own tail that might not be there. So this effort may well be a CDM (“career damaging move”) but I like it nevertheless. It is related to various exciting conceptual and foundational issues.

I do have occasionally a sense of progress, (often followed by a backtrack) and for this chapter, rather than describing detrimental noise by various (counterintuitive) properties as I always did, I think I have an honest definition of detrimental noise. Let me tell you about it. (Here is a recent useful guide: a zoo of quantum algorithms, produced by Stephen Jordan.)

## Detrimental noise

Consider a quantum memory with $n$ qubits at a state $\rho_0$. Suppose that $\rho_0$ is a tensor product state. The noise affecting the memory in a short time interval can be described by a quantum operation $E_0$. Lets suppose that $E_0$ acts independently on different qubits and, for qubit $i$ with some small probability $p_i$$E_0$ changes it state to the maximum entropy state $\tau_i$.

This is a very simple form of noise that can be regarded as basic to understanding the standard models of noise as well as of detrimental noise.

In the standard model of noise, $E_0$ describes the noise of the quantum memory regardless of the state $\rho$ stored in the memory. This is a quite natural and indeed expected form of noise.

A detrimental noise will correspond to a scenario in which, when the quantum memory is at a state $\rho$ and $\rho= U \rho_0$, the noise $E$ will be $U E_0 U^{-1}$. Such noise is the effect of first applying $E_0$ to $\rho_0$ and then applying $U$ to the outcome noiselessly.

Of course, in reality we cannot perform $U$ instantly and noiselessly and the most we can hope for is that $\rho$ will be the result of a process. The conjecture is that a noisy process leading to $\rho$ will be subject to noise of the form we have just described. A weaker weaker conjecture is that detrimental noise is present in every natural noisy quantum process. I also conjecture that damaging effects of the detrimental noise cannot be canceled or healed by other components of the overall noise.When we model a noisy quantum system either by a the qubits/gates description or in other ways we make a distinction between “fresh” errors which are introduced in a single computer cycle (or infinitesimally when the evolution is described by a continuous model) and the cumulative errors along the process. The basic insight of fault tolerant quantum computing is that if the incremental errors are standard and sufficiently small then we can make sure that the cumulated errors are as well. The conjecture applies to fresh errors.

(Updated: Nov 19; sorry guys, the blue part is over-simplified and incorrect; But an emergency quantifier replacement seemed to have helped; it seems ok now)  The definition of detrimental noise for general quantum systems that we propose is as follows:

A detrimental noise of a quantum system at a state $\rho$ commutes with every some non-identity quantum operation which stabilizes $\rho$.

Note that this description,

Just like for the standard model of noise, we do not specify a single noise operation but rather gives an envelope for a family of noise operations.

In the standard model of noise the envelope $\cal D_{\rho}$ of noise operations when the computer is at state $\rho$ does not depend on $\rho$. For detrimental noise there is a systematic relation between the envelope of noise operations ${\cal D}_\rho$ and the state $\rho$ of the computer. Namely,

${\cal D}_{U\rho} = U {\cal D}_\rho U^{-1}$.

## Why is it detrimental?

Detrimental noise leads to highly correlated errors when the state of the quantum memory is highly entangled. This is quite bad for quantum error-correction, but an even more devastating property of detrimental noise is that the notion of “expected number of qubit errors” becomes sharply different from the rate of noise as measured by fidelity or trace distance. Since conjugation by a unitary operator preserves fidelity-metric, the expected number of qubit errors increases linearly with the number of qubits for highly entangled states.

Here is another little thing from my paper that I’d like to try on you:

## A riddle: Can noise remember the future?

Suppose we plan a process and carry it out up to a small amount of errors. Can there be a systematic relation between the errors at some time and the planned process at a later time? Continue reading