## Four Derandomization Problems Polymath4 is devoted to a question about derandomization: To find a deterministic polynomial time algorithm for finding a k-digit prime.  So I (belatedly) devote this post to derandomization and, in particular, the following four problems.

1) Find a deterministic algorithm for primality

2) Find a combinatorial characterization (and a deterministic algorithm) for generically 3-rigid graphs

3) Find an explicit construction for Ramsey graphs

4) Find a deterministic method for polling

(Remark: I was very slow in writing this post, and I had to give up elaborating the details on some very nice topics in order to finish it.)

## Primality

The famous randomized algorithms for primality are still probably the quickest ways to check if a number is prime. When Rabin’s algorithm, for example, gives the answer “yes the number n is a prime” you can be sure with high probability that this is indeed the case.  The probability refers to internal randomization of the algorithm.

The randomized algorithms are examples of mathematical proofs “beyond reasonable doubts”. There are mathematical statements that can be proved “with high probability,” and this is not a heuristic notion but a rigorous notion based on some internal randomization on which the proof depends. It is a rigorous notion for a single mathematical statement and not a probabilistic notion over a space of many such statements.  There are not many randomized proofs for mathematical theorems, but finding such proofs is an exciting possibility.

Two decades after the first randomized algorithms for primality by Solovay-Strassen and Rabin (which are referred to as “Monte-Carlo” tests), the first “Las Vegas” primality algorithm was found by Goldwasser and Kilian. (An extention of the algorithm to all primes was achieved by Adleman and Huang.)  For this algorithm, which is based on elliptic curves, there is a small probability that the algorithm will not give an answer at all. But when the algorithm declares a number to be prime then the number is prime for sure!  Another decade passed before the first deterministic polynomial algorithm was found by Manindra Agrawal, Neeraj Kayal and Nitin Saxena. Practically, Rabin’s and others’ randomized methods are still much better and they give the main example of genuine mathematical proofs “beyond a reasonable doubt”. You can read more about primality tests in this wikipedia article, and read also the wikipedea article about primality certificate. and the AKS deterministic algorithm for primality is described in this post by Terry Tao (among other places). It was a beautiful combination of number theoretic ideas and derandomization techniques. In particular it was based on derandomizing of a randomized algorithm discovered a few years eralier by Agrawal.

## Rigidity of graphs and identity testing

Let G be an abstract graph with n vertices. A spatial framework is an embedding of the vertices of G into space.  The embedding is flexible if there is a non-trivial flex, i.e., a non-trivial perturbation of the embedded vertices that keeps all the edge lengths fixed. Here, a trivial flex means a flex that comea from a rigid motion of the entire space. In other words, a trivial flex keeps the distances between every pair of embedded vertices fixed. An embedding is rigid if it is not flexible. There is a related linear notion of infinitesimal flex – an assignment of velocity vectors to the vertices so that the edge length is constant to the first degree.

A graph G is generically 3-rigid if it is rigid for a generic embedding into space.

A necessary and sufficient condition for generic rigidity is that there exists an embedding for which the graph is infinitesimally rigid. Testing infinitesimal rigidity for a specific embedding amounts to solving a system of linear equations so it is computationally easy. This gives a simple randomized algorithm to decide if a graph is generically 3-rigid. Generic 3 rigidity amounts to a certain polynomial P with interetminate coefficients not being identically zero. The polynomial P is in fact a determinant of a 3n-6 by 3n – 6 matrix, where n is the number of vertices of the graph. One of the most basic randomized algorithms is based on showing that this can be checked by testing the value of the polynomial P at sufficiently many integer points.

This use of randomnessin this case  is a central (perhaps the central) method  in the area of randomized algorithms. The problem is  referred to as “identity testing”. You are given a polynomial P whose coefficents are indeterminate.

Question (Identity testing): Is P identically zero?

Algorithm:  Substitute “enough” random integer values for the coefficients. If you get always zero then with high probability P is identically zero. (Of course, if you get a nonzero answer even once then P is not identically zero.)

(What is “enough?” look at this interesting post by Dick Liption.) Identity testing is a fundamental quastion (perhaps the fundamental question) where randomness helps and where we do not have a clue what to do without randomization.

Let us discuss derandomization of this problem. It is not known if there a combinatorial description of generically 3-rigid graphs and if there exists a deterministic polynomial-time algorithm to decide if a graph is 3-rigid.

Now let’s look at dimension 1 and 2. A graph is generically 1-rigid if it is connected and this is quite a basic property.  In two dimensions there is a simple combinatorial condition by Laman: A graph with n vertices is generically 2-rigid and is minimal with respect to this property, if and only if, it has 2n-3 edges, and every subgraph on m vertices has at most 2m-3 edges.

For 3 dimensions the critical number of edges is 3n-6, but the double-banana example  shows that Laman’s condition does not extend. The double-banana example

## Ramsey graphs

A k-Ramsey graph is a graph with $n$ vertices which has neither a complete subgraph on k vertices nor an empty subgraph on k vertices. A famous result of Erdos asserts that a random graph with \$2^{k/2}\$ vertices is k-Ramsey with positive probability. This is one of the earlier applications of the “probabilistic method” in combinatorics. To find explicit constructions for Ramsey graphs has been an outstanding open problem since the discovery of the random examples. (Computational complexity allows a formal way to state this question.) Until recently the best known examples were based on the Frankl-Wilson theorem. They gave explicit examples in which roughly $n= k^{\log k}$. Recently even better construction giving $\log n = ({\log k})^t$ for an arbitrary positive $t$ were found by Barak, Rao, Shaltiel, and Wigderson .They use in a very difficult way sum-product theorems (especially the sum product theorem of Bourgain, Katz and Tao,) together with various deep derandomization methods.

## Polling

One of the surprising insights of statistics is that the sample size required for a public election poll does not depend on the size of the population. Can you derandomize polling? Practically, you surely can, and pseudo-random generators are commonly used without hesitation. But is there a deterministic substitute?

Let me raise, as some provocative food for thought,  the analogy between finding deterministically a prime which was polymath4’s aim and deterministically polling.

## All sort of things

What else did I want to talk about.  The randomization revolution in complexity theory: This is a wonderful story. Some buzz words are: Interactive proofs, zero knowledge proofs, PCP and hardness of approximation, rapid mixing algorithms, randomness in distributed computing, and more.  Choose a random post on Lipton’s blog and with high probability you will meet an important aspect of randomization.

The Probabilistic method: The probabilistic method refers to the method of using probabilistic techniques for problems which seems to be completely unrelated to probability. Applying probabilistic methods revolutionized or led to major developments in many mathematical areas. We already talked about combinatorics, and computer science (probabilistic issues are of great importance also in cryptography), probability also entered in unexpected ways in number theory, Banach space theory, real analysis, game theory, group theory and maybe also geometry and topology. If in many areas the probabilistic method entered surprisingly in different times, how could it be surprising after all?

Why derandomization? There is no question that randomization and randomized algorithm completely changed the computational complexity landscape. What is the motivation for derandomization? One can argue that random bits are practically not really a computational resource we have to worry about like time and space. “Yes, right” one can answer, this is correct but it is a manifestation of the deep and fundamental conjecture that derandomization is possible via pseudo-random generators. A main reason to study derandomization, in my opinion,  is that it is a profound and novel study of the notion of randomness, a scientific/mathematical notion that baffled scientists for centuries. A second reason is that as randomization itself has become of primary importance in complexity theory; derandomozation too has more and more applications. One example is the Zig-Zag products which played a role in Omer Reingold’s recent proof of SL=L and also, in a way, in Irit Dinur’s new approach to PCP.

Hardness vs deradomization:  This is a remarkable tradeoff that was discovered in recent decades and led to a wider belief (but is very far from offering a proof) that randomness does not drastically help (BPP=P).

How to test randomness: Does theoretical computer science give a better understanding of the notion of randomness compared to traditional statistics’ approach? Well, this is a matter of debate. An interesting debate!

What is probability, anyway. Not so many mathematicians, even probabilists, are interested these days in the foundations of probability theory. For example, does (classical) probability manifests only human uncertainty and has no other “objective” meaning?

This entry was posted in Computer Science and Optimization, Probability and tagged , , . Bookmark the permalink.

### 7 Responses to Four Derandomization Problems

1. rjlipton says:

Your question on polling is very interesting. As a practical issue many “people” do not trust random methods. Of course they usually are from other areas and do not realize the huge power of proper use of randomness. I wonder if there were deterministic methods that were practical for polling, or related problems, that other areas would not use them? I have personal experience trying to get friends in say an area like systems to use more randomness and they are reluctant to use it.

Also thanks for pointer to my comments.

2. commenter says:

The randomized identity testing approach to testing is from Agrawal and Biswas.

The Ramsey bound should say log n = (log k)^t .
To restate in familiar terms, for arbitrary epsilon, they get “explicit” graphs on n vertices with no clique or independent set of size >
2^{log n^{epsilon}}.

GK: Thanks for the remarks! The typo is now corrected.

3. Gil Kalai says:

Major progress on explicit Ramsey for the complete and for the bipartite case is reported in this post. https://gilkalai.wordpress.com/2015/08/14/updates-and-plans-iii/. Another major result is the (almost complete) derandomization of the isolation lemma leading to quasi-NC deterministiv algorithm for Matching http://eccc.hpi-web.de/report/2015/177/ .

4. GFP says:

I think the bound on the random construction should be 2^{k/2}.

GK: Dear GFP, thanks corrected. 2k k/2 how can you tell them apart?