Let be the -dimensional cube. Turn into a torus by identifying opposite facets. What is the minumum -dimensional volume of a subset of which intersects every non-trivial cycle in .

This is really off-topic, but the post reminded me of something I wanted to ask you. Feel free to ignore.

Question: Are there any versions of the KKL Theorem or of Friedgut’s theorem for distributions that are not product distributions? Specifically, I’m considering distributions D where 1/2<=D(x)/D(y)<=2 for any x and y that differ only by 1 coordinate. I only need something that works for monotone functions.

(For those who don’t know what the hell I’m talking about: Friedgut’s theorem says that if the influence of a boolean function f is k then f can be epsilon-approximated by a function that just depends on c^{k/eps} variables for some universal constant c. It holds even for product distributions mu_p^n, in which case c depends on p).

I need this for an application in learning theory. The application should be obvious to anyone who knows the (as far as I know) one and only use of Friedgut’s theorem in Learning Theory, but due to the location being public, I won’t say more.

One thing I can contribute is that the most basic theorem (what Friedgut calls KKL1 in his paper about "KKL and BKKL") is _not_ true. Consider the majority function, and then choose the distribution to be symmetric (i.e. D(x) depends only on the weight of x) and let D be smallest in the middle two layers, and increasing as fast as possible to both sides. A simple calculation shows that the total influence is o(1) (in fact, exponentially small).
(here I’m assuming that we define the influence like Friedgut does for product spaces, i.e. the naive way, Inf_f(i)=Pr_D(f(x) \neq f(x \xor i)). This definition makes sense to me, but alternatives might work out as well).
It it easy to see that MAJ with this distribution is a balanced function (for odd n), so KKL1 is not true. QED.

So, KKL1 doesn’t work. However, I actually need Friendgut’s Theorem, or maybe KKL2 or something. And then the task is to find a function of a small number of variables that approximates f, and that seems like maybe it would be true. Furthermore, I only need it for monotone functions.

Friedgut-Dinur do have a variant of BKKKL that works for monotone functions, but I can’t see how to use it. (I tried defining g(x)=f(x)*D(x) and applying Friedgut-Dinur or any continuous type of KKL thingy, but that doesn’t seem to be useful.

Part of the problem in proving anything is that I can’t get any variant of Fourier Analysis to be relevant. Of course I can’t base Fourier Analysis on D, since I need a basis of orthogonal functions. So I need to define a product space in some way. I tried to define p_i as P(x_i=1) and just using the corresponding orthonormal basis, but that didn’t work for me. I tried other things as well, but nothing works. I’m skeptical that anything like this might work. (But then again, back in Poland I was already skeptical).

What I’d really love to have is a combinatorial proof of Friedgut’s theorem or of some (possibly weaker) variant of KKL that doesn’t use Bonami-Beckner or Fourier Analysis at all, but just combinatorial arguments and maybe some applications of Cauchy-Schwartz. I understand that such a thing is not known? Before I understood how KKL works, my intuition told me that proving KKL combinatorially should be possible (over uniform distrib, let’s say), since you can remove one variable with small influence very easily, by "squishing" f, i.e. setting f'(x)=(f(x)+f(x \xor i))/2. So you could conceivably just repeat this for each variable with small influence, and do something clever to put it all together and avoid having to pay the price for each coordinate separately. My intuition might have been too-simplistic, as usual. I understand now that that’s where you use Fourier Analysis and Bonami-Beckner (in a way), exactly in order to avoid being punished by each coordinate seperately. And I don’t know if you can do this combinatorially.

Gosh, this turned out much longer than I planned. I just wanted to ask you the question from the first paragraph: is there a version of KKL/Friedgut’s Theorem for some class of non-product distribs, or a version that doesn’t use Fourier Analysis? It’s enough for me if it works just for monotone functions. Feel free to ignore the question. If you’re interested in discussing it, we can talk by email (or here, this might be fun — that’s why I’m posting here, it seems useful to have it on google as well).

For d=2 I’m getting an answer of about 1.9542589152342531. You can get length 2 very easily by taking two perpendicular sides of the square. But you can do better by taking a Steiner tree of three of the four square vertices, say the points at (0,0), (0,1), and (1,0), with a Steiner vertex at (x,x) where x=sqrt(3)/(2+2sqrt(3)). The tree has a diagonal edge from (0,0) with length sqrt(2)x, and two longer edges to the other two vertices with length sqrt(x^2+(1-x)^2), giving the total as calculated above. It cuts the square torus into a non-regular hexagon with 120-degree vertices that tiles the plane with the translational symmetries of the square tiling.

I must have miscalculated somewhere, because the solution I was trying to describe is the same as the one depicted in spherical cows “Spherical Cubes and Rounding in High Dimensions” but the length is different.

Dear Elad,
I have to think more about what you asked. Usually when you leave product measures things are difficult. I am aware of two papers in this direction. There is a result for FKG distributions that may interest you: Influence and sharp-threshold theorems for monotonic measures, by Ben Graham and Geoffry Geimmett http://front.math.ucdavis.edu/0505.5057 There is also a paper by Olle Haggstrom, Elchanan Mossel and me called “A Law of Large Numbers for Weighted Majority” http://front.math.ucdavis.edu/0406.5509. Both these paper uses a different notion of influence which is simply the correlation between the value of f and the value of .

Thank you very much for your useful advice! Looks like I have some reading to do. On a shallow look, none of the results in the papers seem to exactly be what I need, but the Graham-Grimmett paper has a proof idea (Thm. 2.10) that I never tried, and your paper with Olle and Elchanan contains a bunch of useful warnings.

Your paper with Olle and Elchanan actually shows that weird things happen when we consider arbitrary distributions (or even FKG distributions) and consider “effect” instead of “influence”. In my setting, I can use your notion of “effect”, but as you show, bad things happen. So I prefer my own version of “influence” which I believe might lead to less catastrophes. Here it is: it’s the same thing as your “effect”, but for the case x_i=1, I “re-normalize” the contribution of f(x) by multiplying it by D(x xor i)/D(x). Thus, if variable i has no effect on f, then my concept of influence will = 0, which is not the case for your “effect”, and this makes my definition “more right” ;). Another benefit of this definition is that it can be well-estimated for monotone f if you;re given a random oracle to the function f and a “membership” oracle to the distribution D. This is good for my learning application, since I actually need to compute these things. If I didn’t care about being able to compute it, I could just define it in the classic way, of just choosing a random x according to D, and checking what’s the probability that f(x)~=f(x xor i). In fact, for my class of distributions, this is related to my definition by a multiplicative constant, I think.

Anyway, just wanted to point out that maybe the weakness you prove in your paper with Olle and Elchanan might come from a problematic definition of “effect”, where the effect of variable i might be non-zero even if it does not affect the value of the function.

But I’m doing something bad and talking about papers that i haven’t given enough attention to. I’ll keep reading and say if I managed to get something interesting. Please tell me if you remember anything else that might be relevant.

Thanks again Gil!!

P.S. on a side note, I really liked the explanation of Condorcet’s Jury Theorem in your paper. Someone should have told Feynman about it, and explained to him that rather than yelling about the textbook committee members and acting like a volcano, he should just convince them to use majority voting rather than averaging on the reviews that they receive. And then the committee should also majority-vote as well. Because, apparently, majority vote of almost-junk is not junk at all. This managed to surprise me since I never thought about it.

elad verbinHi Gil,

This is really off-topic, but the post reminded me of something I wanted to ask you. Feel free to ignore.

Question: Are there any versions of the KKL Theorem or of Friedgut’s theorem for distributions that are not product distributions? Specifically, I’m considering distributions D where 1/2<=D(x)/D(y)<=2 for any x and y that differ only by 1 coordinate. I only need something that works for monotone functions.

(For those who don’t know what the hell I’m talking about: Friedgut’s theorem says that if the influence of a boolean function f is k then f can be epsilon-approximated by a function that just depends on c^{k/eps} variables for some universal constant c. It holds even for product distributions mu_p^n, in which case c depends on p).

I need this for an application in learning theory. The application should be obvious to anyone who knows the (as far as I know) one and only use of Friedgut’s theorem in Learning Theory, but due to the location being public, I won’t say more.

One thing I can contribute is that the most basic theorem (what Friedgut calls KKL1 in his paper about "KKL and BKKL") is _not_ true. Consider the majority function, and then choose the distribution to be symmetric (i.e. D(x) depends only on the weight of x) and let D be smallest in the middle two layers, and increasing as fast as possible to both sides. A simple calculation shows that the total influence is o(1) (in fact, exponentially small).

(here I’m assuming that we define the influence like Friedgut does for product spaces, i.e. the naive way, Inf_f(i)=Pr_D(f(x) \neq f(x \xor i)). This definition makes sense to me, but alternatives might work out as well).

It it easy to see that MAJ with this distribution is a balanced function (for odd n), so KKL1 is not true. QED.

So, KKL1 doesn’t work. However, I actually need Friendgut’s Theorem, or maybe KKL2 or something. And then the task is to find a function of a small number of variables that approximates f, and that seems like maybe it would be true. Furthermore, I only need it for monotone functions.

Friedgut-Dinur do have a variant of BKKKL that works for monotone functions, but I can’t see how to use it. (I tried defining g(x)=f(x)*D(x) and applying Friedgut-Dinur or any continuous type of KKL thingy, but that doesn’t seem to be useful.

Part of the problem in proving anything is that I can’t get any variant of Fourier Analysis to be relevant. Of course I can’t base Fourier Analysis on D, since I need a basis of orthogonal functions. So I need to define a product space in some way. I tried to define p_i as P(x_i=1) and just using the corresponding orthonormal basis, but that didn’t work for me. I tried other things as well, but nothing works. I’m skeptical that anything like this might work. (But then again, back in Poland I was already skeptical).

What I’d really love to have is a combinatorial proof of Friedgut’s theorem or of some (possibly weaker) variant of KKL that doesn’t use Bonami-Beckner or Fourier Analysis at all, but just combinatorial arguments and maybe some applications of Cauchy-Schwartz. I understand that such a thing is not known? Before I understood how KKL works, my intuition told me that proving KKL combinatorially should be possible (over uniform distrib, let’s say), since you can remove one variable with small influence very easily, by "squishing" f, i.e. setting f'(x)=(f(x)+f(x \xor i))/2. So you could conceivably just repeat this for each variable with small influence, and do something clever to put it all together and avoid having to pay the price for each coordinate separately. My intuition might have been too-simplistic, as usual. I understand now that that’s where you use Fourier Analysis and Bonami-Beckner (in a way), exactly in order to avoid being punished by each coordinate seperately. And I don’t know if you can do this combinatorially.

Gosh, this turned out much longer than I planned. I just wanted to ask you the question from the first paragraph: is there a version of KKL/Friedgut’s Theorem for some class of non-product distribs, or a version that doesn’t use Fourier Analysis? It’s enough for me if it works just for monotone functions. Feel free to ignore the question. If you’re interested in discussing it, we can talk by email (or here, this might be fun — that’s why I’m posting here, it seems useful to have it on google as well).

D. EppsteinFor d=2 I’m getting an answer of about 1.9542589152342531. You can get length 2 very easily by taking two perpendicular sides of the square. But you can do better by taking a Steiner tree of three of the four square vertices, say the points at (0,0), (0,1), and (1,0), with a Steiner vertex at (x,x) where x=sqrt(3)/(2+2sqrt(3)). The tree has a diagonal edge from (0,0) with length sqrt(2)x, and two longer edges to the other two vertices with length sqrt(x^2+(1-x)^2), giving the total as calculated above. It cuts the square torus into a non-regular hexagon with 120-degree vertices that tiles the plane with the translational symmetries of the square tiling.

Ryan O'DGuy, Anup, Avi and I will describe a solution showing f(3) < 2.801 in the journal version of our paper… :)

D. EppsteinI must have miscalculated somewhere, because the solution I was trying to describe is the same as the one depicted in spherical cows “Spherical Cubes and Rounding in High Dimensions” but the length is different.

Gil KalaiDear Elad,

I have to think more about what you asked. Usually when you leave product measures things are difficult. I am aware of two papers in this direction. There is a result for FKG distributions that may interest you: Influence and sharp-threshold theorems for monotonic measures, by Ben Graham and Geoffry Geimmett http://front.math.ucdavis.edu/0505.5057 There is also a paper by Olle Haggstrom, Elchanan Mossel and me called “A Law of Large Numbers for Weighted Majority” http://front.math.ucdavis.edu/0406.5509. Both these paper uses a different notion of influence which is simply the correlation between the value of f and the value of .

elad verbinDear Gil,

Thank you very much for your useful advice! Looks like I have some reading to do. On a shallow look, none of the results in the papers seem to exactly be what I need, but the Graham-Grimmett paper has a proof idea (Thm. 2.10) that I never tried, and your paper with Olle and Elchanan contains a bunch of useful warnings.

Your paper with Olle and Elchanan actually shows that weird things happen when we consider arbitrary distributions (or even FKG distributions) and consider “effect” instead of “influence”. In my setting, I can use your notion of “effect”, but as you show, bad things happen. So I prefer my own version of “influence” which I believe might lead to less catastrophes. Here it is: it’s the same thing as your “effect”, but for the case x_i=1, I “re-normalize” the contribution of f(x) by multiplying it by D(x xor i)/D(x). Thus, if variable i has no effect on f, then my concept of influence will = 0, which is not the case for your “effect”, and this makes my definition “more right” ;). Another benefit of this definition is that it can be well-estimated for monotone f if you;re given a random oracle to the function f and a “membership” oracle to the distribution D. This is good for my learning application, since I actually need to compute these things. If I didn’t care about being able to compute it, I could just define it in the classic way, of just choosing a random x according to D, and checking what’s the probability that f(x)~=f(x xor i). In fact, for my class of distributions, this is related to my definition by a multiplicative constant, I think.

Anyway, just wanted to point out that maybe the weakness you prove in your paper with Olle and Elchanan might come from a problematic definition of “effect”, where the effect of variable i might be non-zero even if it does not affect the value of the function.

But I’m doing something bad and talking about papers that i haven’t given enough attention to. I’ll keep reading and say if I managed to get something interesting. Please tell me if you remember anything else that might be relevant.

Thanks again Gil!!

P.S. on a side note, I really liked the explanation of Condorcet’s Jury Theorem in your paper. Someone should have told Feynman about it, and explained to him that rather than yelling about the textbook committee members and acting like a volcano, he should just convince them to use majority voting rather than averaging on the reviews that they receive. And then the committee should also majority-vote as well. Because, apparently, majority vote of almost-junk is not junk at all. This managed to surprise me since I never thought about it.

Pingback: Download Jamie Foxx songs

Pingback: Answer to Test Your Intuition (3) « Combinatorics and more

Pingback: Test Your Intuition (4) « Combinatorics and more

Pingback: Answer To Test Your Intuition (4) « Combinatorics and more