Statistical Test in Action – Pepsi or Cola

A student of Economic Analysis, Jan, volunteered when I asked if anyone could tell Pepsi from Coca-Cola by taste. So, we decided to conduct a test with the whole class during the statistics lecture. Eliana, a student who volunteered to help with the experiment, would toss a coin behind the door and, depending on the result, she would pour Pepsi or Coke into a cup; another student, Maria, would bring the cup to the lecture hall, not knowing its contents; Jan would try and guess what drink was in the cup. We found that the test was successful – Jan showed that he could tell which of the two beverages he got with better accuracy than we would expect based on the assumption of randomness: in 13 attempts, Jan guessed correctly 11 times.

The test also went surprisingly well from a didactic perspective. In this example we were able to discuss typical elements of a statistical test.

Null hypothesis. In this case, the null hypothesis was that Jan cannot tell the difference between Coke and Pepsi, which means that the proportion of correct guesses in the “population” (here it is more apt to speak of the process generating the data) is 1/2, which corresponds to pure guessing:

$$H_0: p = 0.5$$

Alternative hypothesis. The natural choice in this case was the right-tailed hypothesis that the probability of a correct guess in a single attempt is greater than 1/2:

$$H_A: p > 0.5$$

Test statistic. The test statistic in this case was simply the number of successes (correct guesses).

Rejection region. Before conducting the experiment, we decided that we would accept Jan’s skills in this regard (that is, reject the null hypothesis) if he guessed correctly ten or more times in 13 attempts. We were guided in this by the significance level (see below).

Significance level ($\alpha$). The significance level is the probability of rejecting the null hypothesis when it is true (the probability of a type I error). In our case, one can determine the significance level from the binomial distribution:

$$\begin{split} X\sim \text{Binom}(n=13, p=0.5) \\ \alpha = P(X \ge 10) = 0.0461\end{split} $$

We can calculate the significance level in R:

1-pbinom(9, 13, .5)
## [1] 0.04614258

During the class, we used an appropriate spreadsheet template.
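The choice of ten as the cut-off can be reproduced with a short search over possible critical values. This is only a sketch: the 5% target (`alpha_max`) is an assumption, since the text does not state which threshold was used, and the variable names are illustrative.

```r
# Sketch: find the smallest critical value k with P(X >= k) <= alpha_max under H0.
# alpha_max = 0.05 is an assumed target, not a value stated in the text.
n <- 13
p0 <- 0.5
alpha_max <- 0.05

k <- 0:n
# P(X >= k) = 1 - P(X <= k - 1); lower.tail = FALSE returns the upper tail directly
tail_prob <- pbinom(k - 1, n, p0, lower.tail = FALSE)

critical <- min(k[tail_prob <= alpha_max])  # smallest k that meets the target
alpha <- tail_prob[k == critical]           # significance level actually attained

critical
alpha
```

Under this assumption the search returns a critical value of 10 with an attained significance level of about 0.0461, which agrees with the calculation above. Because the binomial distribution is discrete, the attained level is usually below the target rather than exactly equal to it.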

P-value. Jan guessed correctly 11 times. The p-value is the probability of obtaining this or a more extreme result given that the null hypothesis is true. It can be determined from the same binomial distribution:

$$P(X \ge 11) = 0.0112$$

1-pbinom(10, 13, .5)
## [1] 0.01123047
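As a cross-check, the same tail probability can be obtained by adding up the individual binomial probabilities of 11, 12 and 13 correct guesses. The variable names below are illustrative only.

```r
# Sketch: p-value as an explicit sum of point probabilities under H0 (p = 0.5)
n <- 13
observed <- 11

point_probs <- dbinom(observed:n, n, 0.5)  # P(X = 11), P(X = 12), P(X = 13)
p_value_sum <- sum(point_probs)

# The same quantity via the cumulative distribution function
p_value_cdf <- 1 - pbinom(observed - 1, n, 0.5)

p_value_sum
all.equal(p_value_sum, p_value_cdf)
```

Both routes give about 0.0112. The explicit sum makes it easier to see what "such or more extreme" means for a right-tailed test.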

Power of the test. Jan told the class that, judging by the tests he had done at home the previous day, his long-term accuracy could be assumed to be about 85%. With this assumption, we can determine the power of the test performed during the lecture. The power of the test is $1 - \beta$, where $\beta$ (beta) is the probability that we will not reject the null hypothesis when the underlying probability of success (the proportion in the process generating the data) is 0.85.

$$\begin{split} Y\sim \text{Binom}(n=13, p=0.85) \\ 1 - \beta = 1 - P(Y < 10) = P(Y \ge 10) = 0.882\end{split} $$

1-pbinom(9, 13, .85)
## [1] 0.8819973

The power of the test with the above assumptions is 88.2%.
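The power depends on the assumed true accuracy, so it can help to see how it changes for other values. The sketch below wraps the calculation in a small helper; `power_at` and the grid of accuracies are hypothetical choices made for illustration, not part of the classroom experiment.

```r
# Sketch: power of the "10 or more out of 13" rule for several assumed true accuracies
n <- 13
critical <- 10

# Probability of rejecting H0 (X >= critical) when the true success probability is p_true
power_at <- function(p_true) {
  pbinom(critical - 1, n, p_true, lower.tail = FALSE)
}

p_grid <- c(0.5, 0.6, 0.7, 0.85, 0.95)  # illustrative values only
power_table <- data.frame(p_true = p_grid, power = power_at(p_grid))

power_table
```

At `p_true = 0.85` the helper reproduces the 88.2% computed above, and at `p_true = 0.5` it returns the significance level, since rejecting a true null hypothesis is exactly the type I error.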

How do we do this in R? The test we performed is not typically presented in the statistics course at our department. It is an exact binomial test. In R, the appropriate calculations can be performed using the binom.test function.

binom.test(x=11, n=13, alternative="greater")
## 
## 	Exact binomial test
## 
## data:  11 and 13
## number of successes = 11, number of trials = 13, p-value = 0.01123
## alternative hypothesis: true probability of success is greater than 0.5
## 95 percent confidence interval:
##  0.5899014 1.0000000
## sample estimates:
## probability of success 
##              0.8461538
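The object returned by binom.test can also be stored and queried, which is convenient for confirming that its p-value matches the manual binomial calculation. A minimal sketch, with `res` and `manual` as illustrative names and the null proportion `p = 0.5` written out explicitly (it is also the default):

```r
# Sketch: store the test result and compare it with the manual tail probability
res <- binom.test(x = 11, n = 13, p = 0.5, alternative = "greater")

res$p.value    # p-value of the right-tailed exact test
res$estimate   # observed proportion of correct guesses (11/13)
res$conf.int   # one-sided 95% confidence interval for the success probability

# Manual p-value: P(X >= 11) under Binom(13, 0.5)
manual <- pbinom(10, 13, 0.5, lower.tail = FALSE)
all.equal(res$p.value, manual)
```

The lower confidence bound of about 0.59 lies above 0.5, which is consistent with rejecting the null hypothesis at the 5% level.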

Błażej Kochański
Banking Risk Expert, Researcher and Management Consultant