You have a box with n red balls and n blue balls. You take out balls one by one at random until left only with balls of the same color. How many balls will be left (as a function of n)?

1) Roughly  εn for some ε>0.

2) Roughly $\sqrt n$?

3) Roughly log n?

4) Roughly a constant?

Here is the collective intuition regarding this problem

If you have two different boxes A with n red balls, and B with n blue balls and you take out at random with equal probabilities a ball from box A or a ball from box B, then when only one color is left the number of balls left is roughly $\sqrt n$.  But in our case, when there are more red balls left it is more likely that we take out a red ball. Precisely, if we think about our problem but leaves the red and blue balls in different boxes,  when there are a balls left in box A, and b balls left in box B you choose to take out another box from box A with probability a/(a+b). This seems to push the number of balls of each color closer together but by how much?

You can look at your drawing process as a random permutation of n red balls and n blue balls. We ask what is the distribution of the number of last balls of the same color. This is the same as the distribution D of the number of first balls of the same color. The first balls come at random with probability for red/blue very close to 1/2. This gives that the expected number of balls of the same color initial for the permutation is a constant, roughly 2, and that D is concentrated on constant numbers.

There were many interesting comments to this problem. Thanks for all the commentators! When Itai and Ronen asked me this question my spontaneous guess was wrong.

This entry was posted in Probability, Test your intuition and tagged , , , , . Bookmark the permalink.

1. oz says:

There is an interesting (and more difficult) variation I once heard but can’t recall where:

You take out each time a ball at random as before. But, if the ball was red, you put it back in the box and take out a blue ball. If the ball was blue, you put it back in the box and take out a red ball.
You keep as before until left only with balls of the same color. How many balls will be left (as a function of n)?

2. igorpak says:

Dear Gil, Here is another way to think about this. Denote by T a random plane tree on n vertices. What’s the expected degree of the root? The answer is – it tends to 3, and for the same reason your number of remaining balls in your example tends to 2. The connection might be surprising and not very obvious, but here is a sketch.

You map your ball drawing process as a path on a grid starting at the origin. Red ball would correspond to upsteps (1,1) and blue to downsteps (1,-1). Reflect the resulting path so it’s above y=0 line. These are called Dyck paths. Not much is lost when you do the reflection, and you are looking at the length of the final descenting path DP.

Recall that Dyck paths are in bijection with plane trees. Basically, use the DFS on a tree (walk from left to right) and every step away from the root correspond to upstep, towards the root – downstep. Now, your DP is mapped into the length of the path of rightmost edges. Convert the plane tree into a binary tree via a standard bijection, like here: http://bit.ly/17pPdSn Again, you get a rightmost path, just one step shorter. Reflect the binary tree left to right, and map it back to a plane tree. You get a plane tree T whose degree of the root is distributed as DP+1. Done!

For the reference and precise formula on the root distribution, see Flajolet-Sedgewick’s fantastic book http://algo.inria.fr/flajolet/Publications/books.html (see Ex 5, p.141). This limit also recently came up in my own work on random Catalan objects: http://www.math.ucla.edu/~pak/papers/PermShape7.pdf (see Thm 6.1).