# Test Your Intuition (15): Which Experiment is More Convincing

Consider the following  two scenarios

(1) An experiment tests the effect of a new medicine on people which have a certain illness. The conclusion of the experiment is that for 5% of the people tested the medication led to improvement while for 95% it had no effect. (The experiment followed all rules: it had a control test it was double blind etc. …)

A statistical analysis concluded that the results are statistically significant, where the required statistical significance level is 1%. This roughly means that the probability $p_1$  that such an effect happend by chance (under the “null hypothesis”) is less or equal 0.01. (This probability is called the $p$-value. Suppose that $p_1$= 0.008.)

(2) An experiment tests the effect of a new medicine on people which have a certain illness. The conclusion of the experiment is that for 30% of the people tested the medication led to improvement while for 70% it had no effect. (The experiment followed all rules: it had a control test it was double blind etc. …)

A statistical analysis concluded that the results are statistically significant, where the required statistical significance level is 1%. (Again, this roughly means that the probability $p_2$  that such an effect happend by chance (under the “null hypothesis”) is less or equal 0.01. And again suppose that $p_2$= 0.008.)

Test your intuition: In which of these two scenarios it is more likely that the effect of the medication is real.

You can assume that the experiments are identical in all other terms that may effect your answer. E.g., the theoretical explanation for the effect of the medicine. Note that our assumption $p_1=p_2$ is likely to imply that the sample size for the first experiment is larger.

Two experimental results of 10/100 and 15/100 are not equivalent to one experiment with outcomes 3/200.

(Here is a link to the original post.)

One way to see it is to think about 100 experiments. The outcomes under the null hypothesis will be 100 numbers (more or less) uniformly distributed in [0,1]. So the product is extremely tiny.

What we have to compute is the probability that the product of two random numbers uniformly distributed in [0,1] is smaller or equal 0.015. This probability is much larger than 0.015.

Here is a useful approximation (I thank Brendan McKay for reminding me): if we have $n$ independent values in $U(0,1)$  then the prob of product $< X$ is

$X \sum_{i=0}^{n-1} ( (-1)^i (log X)^i/i!.$

In this case  0.015 * ( 1 – log(0.015) ) = 0.078

So the outcomes of the two experiments do not show a significant support for the theory.

The theory of hypothesis testing in statistics is quite fascinating, and of course, it became a principal tool in science and  led to major scientific revolutions. One interesting aspect is the similarity between the notion of statistical proof which is important all over science and the notion of interactive proof in computer science. Unlike mathematical proofs, statistical proof are based on following certain protocols and standing alone if you cannot guarantee that the protocol was followed the proof has little value.