Category Archives: Information theory

A Discussion and a Debate

Heavier than air flight of the 21 century?

The very first post on this blog entitled “Combinatorics, Mathematics, Academics, Polemics, …” asked the question “Are mathematical debates possible?” We also had posts devoted to debates and to controversies.

A few days ago, the first post in a discussion between Aram Harrow, a brilliant computer scientists and quantum information researcher (and a decorated debator), and myself on quantum error correction was launched in Dick Lipton and Ken Regan’s big-league blog, Gödel’s Lost Letter and P=NP.

The central question we would like to discuss is:

Are universal quantum computers based on quantum error correction possible.

In preparation for the public posts, Ken, Aram, Dick, and me are having very fruitful and interesting email discussion on this and related matters, and also the post itself have already led to very interesting comments. Ken is doing marvels in editing what we write.

Dick opened the post by talking on perpetual motion machines which is ingenious because it challenges both sides of the discussion. Perpetual motion turned out to be impossible: will quantum computers enjoy the same fate? On the other hand (and closer to the issue at hand), an argument against quantum mechanics based on the impossibility of perpetual motion by no other than Einstein turned out to be false, are skeptical ideas to quantum computers just as bogus? (The answer could be yes to both questions.) Some people claimed that heavier-than-air flight might is a better analogy. Sure, perhaps it is better.

But, of course, analogies with many human endeavors can be made, and for these endeavors, some went one way, and some went the other way, and for some we don’t know.

Although this event is declared as a debate, I would like to think about it as a discussion. In the time scale of such a discussion what we can hope for is to better understand each other positions, and, not less important, to better understand our own positions.  (Maybe I will comment here about some meta aspects of this developing discussion/debate.)

A real debate

A real emerging debate is if we (scientists) should boycott Elsevier. I tend to be against such an action, and especially against including refereeing papers for journals published by Elsevier as part of the boycott. I liked, generally speaking,  Gowers’s critical post on Elsevier, but the winds of war and associated rhetoric are not to my liking.  The universities are quite powerful, and they deal, negotiate and struggle with scientific publishers, and other similar bodies, on a regular  basis. I tend to think that the community of scientists should not be part of such struggles and that such involvement will harm the community and science. This is a real debate! But it looks almost over.  Many scientists joined the boycott and not many opposing opinions were made. It looks that we will have a little war and see some action. Exciting, as ever.

Emmanuel Abbe: Erdal Arıkan’s Polar Codes

Click here for the most recent polymath3 research thread. A new thread is comming soon.

Emmanuel Abbe and Erdal Arıkan

This post is authored by Emmanuel Abbe

A new class of codes, called polar codes, recently made a breakthrough in coding theory.

In his seminal work of 1948, Shannon had characterized the highest rate (speed of transmission) at which one could reliably communicate over a discrete memoryless channel (a noise model); he called this limit the capacity of the channel. However, he used a probabilistic method in his proof and left open the problem of reaching this capacity with coding schemes of manageable complexities. In the 90’s, codes were found (turbo codes and LDPC rediscovered) with promising results in that direction. However, mathematical proofs could only be provided for few specific channel cases (pretty much only for the so-called binary erasure channel). In 2008,  Erdal Arıkan at Bilkent University invented polar codes, providing a new mathematical framework to solve this problem.

Besides allowing rigorous proofs for coding theorems, an important attribute of polar codes is, in my opinion, that they bring a new perspective on how to handle randomness (beyond the channel coding problem). Indeed, after a couple of years of digestion of Arıkan’s work, it appears that there is a rather general phenomenon underneath the polar coding idea. The technique consist in applying a specific linear transform, constructed from many Kronecker products of a well-chosen small matrix, to a high-dimensional random vector (some assumptions are required on the vector distribution but let’s keep a general framework for now). The polarization phenomenon, if it occurs, then says that the transformed vector can be split into two parts (two groups of components): one of maximal randomness and one of minimal one (nearly deterministic). The polarization terminology comes from this antagonism. We will see below a specific example. But a remarkable point is that the separation procedure as well as the algorithm that reconstructs the original vector from the purely random components have low complexities (nearly linear). On the other hand, it is still an open problem to characterize mathematically if a given component belongs to the random or deterministic part. But there exist tractable algorithms to figure this out accurately.

Let us consider the simplest setting. Let X_1,...,X_n be i.i.d. Bernoulli(p) and assume that $n$ is a power of 2. Define G_n to be the matrix obtained by taking \log_2(n) Kronecker products of G_2=\begin{pmatrix}1&0\\1&1\\\end{pmatrix} , and multiply X=(X_1,...,X_n) with G_n over GF(2) to get U=(U_1,...,U_n). Note that U has same entropy as X, since G_n is invertible (its inverse is itself). However, if the entropy of X is uniformly spread out over its components, i.e., H(X)=nH(X_1), the entropy of U may not be, since its components are correlated. In any case, we can write H(U)= \sum_i H(U_i|U^{i-1}) where U^{i-1}=(U_1,...,U_{i-1}) are the `past components’. The polarization phenomenon then says that, except for a vanishing fraction of indices i (w.r. to n), each term H(U_i|U^{i-1}) tends to either 0 or 1.

Continue reading