Tukey test
The Tukey test is a way of comparing group means after an ANOVA has shown that there is a significant difference among the means.
You need to check the following assumptions before proceeding with this kind of Tukey test:
- Sample sizes (n) are equal
The goal of the Tukey test is to:
Compare the group means by testing a set of null hypotheses, each stating that the difference between two means is zero. This is accomplished by calculating the T‑statistic, which is used here as a critical value against which all mean differences are compared.
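The T‑statistic is calculated as:
$$T = q_{\alpha}(k, N - k)\,\sqrt{\frac{MS_W}{n}}$$
where $q_{\alpha}(k, N - k)$ is a value found in a table of the Tukey distribution for $k$ groups with degrees of freedom ($N - k$, with $N$ the total number of observations), $MS_W$ is the within variance or Mean Square Within and $n$ is the sample size within each group.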
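As a minimal sketch, the critical value described above can be computed in R with the base function qtukey(), which returns quantiles of the studentized range (Tukey) distribution. The helper name tukey_critical_difference and the within mean square passed to it are assumptions made for illustration; only the number of groups (4) and the group size (12) are taken from the example further down.

```r
# Hypothetical helper: critical difference T = q * sqrt(MS_within / n)
tukey_critical_difference <- function(ms_within, n_per_group, n_groups,
                                      alpha = 0.05) {
  # Within-group degrees of freedom, N - k, for equal group sizes
  df_within <- n_groups * (n_per_group - 1)
  # Upper quantile of the studentized range distribution
  q <- qtukey(1 - alpha, nmeans = n_groups, df = df_within)
  q * sqrt(ms_within / n_per_group)
}

# ms_within = 2.5 is an invented value, not taken from the example data
tukey_critical_difference(ms_within = 2.5, n_per_group = 12, n_groups = 4)
```

Using qtukey() replaces the table lookup; a smaller alpha gives a larger critical difference.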
Next, compare the T‑statistic to all possible mean differences.
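Mean differences with four groups:
$|\bar{x}_1 - \bar{x}_2|$, $|\bar{x}_1 - \bar{x}_3|$, $|\bar{x}_1 - \bar{x}_4|$, $|\bar{x}_2 - \bar{x}_3|$, $|\bar{x}_2 - \bar{x}_4|$, $|\bar{x}_3 - \bar{x}_4|$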
If any absolute difference is larger than the T‑statistic, there is a significant difference between these two means, and the associated null hypothesis should be rejected.
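The comparison step can be sketched in R as follows. The group means and the critical difference are invented numbers chosen only to show the mechanics; they are not the values from the example below.

```r
# Hypothetical group means and critical difference (invented numbers)
group_means <- c(A = 10.2, B = 11.0, C = 15.4, D = 10.7)
critical_difference <- 1.7

# All pairs of groups: choose(4, 2) = 6 comparisons
pair_idx <- combn(names(group_means), 2)
abs_diff <- abs(group_means[pair_idx[1, ]] - group_means[pair_idx[2, ]])

# One row per pair, flagged when the difference exceeds the critical value
comparison <- data.frame(
  pair = paste(pair_idx[1, ], pair_idx[2, ], sep = " - "),
  abs_diff = unname(abs_diff),
  significant = unname(abs_diff) > critical_difference
)
comparison
```

With these invented numbers, only the three pairs that include group C are flagged as significant.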
The Tukey test takes into account that several comparisons are made, which would otherwise increase the risk of a type I error. Without this correction, the risk of wrongly rejecting at least one true null hypothesis would exceed the nominal 5 times in 100; the Tukey test keeps this overall risk at 5 %.
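To see why several uncorrected comparisons inflate the error risk, the following rough sketch computes the chance of at least one false rejection across all pairwise comparisons of four groups. It assumes the comparisons are independent, which is a simplification (pairwise comparisons share group means), so the figure is an approximation only.

```r
alpha <- 0.05
n_groups <- 4
n_comparisons <- choose(n_groups, 2)   # 6 pairwise comparisons

# Chance of at least one false rejection if each comparison were tested
# separately at alpha, treating the comparisons as independent
familywise_error <- 1 - (1 - alpha)^n_comparisons
round(familywise_error, 3)   # 0.265
```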
Example
We use the same example as for the ANOVA. A company wants to find out if there is a difference in total sales between four geographical areas. There are 12 shops in each area, giving 12 values of total sales per year (in million dollars) for each area (Area 1 to Area 4).
The ANOVA found a significant difference between the means ($F_{3,44} = 87.42$, $P < 0.05$). Now we want to find out which means differ. We therefore carry out a Tukey test.
1. Calculate the difference between the means:

2. Compute the T‑statistic:
3. Compare the differences with the T‑statistic:
The only differences that exceed the T‑statistic are those between Area 1 and Area 3, Area 2 and Area 3, and Area 3 and Area 4, that is, all comparisons that involve Area 3. The mean of Area 3 is larger than all others.
4. Interpret the result:
At the 5 % significance level, we conclude that the mean of Area 3 is larger than the means of the other areas.
How to do it in R
#Import the data
data2 <- read.csv("http://www.ilovestats.org/wp-content/uploads/2015/07/Example_data.csv", dec = ",", sep = ";")
#Tukey test
TukeyHSD(aov(Sales ~ Area, data = data2))
plot(TukeyHSD(aov(Sales ~ Area, data = data2)))
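If the data file is not at hand, the same calls can be tried on simulated data. The sketch below invents a data frame with the same layout (a factor Area with four levels and a numeric Sales column, 12 rows per area); the values are random and are not the author's data. It also shows one way to pull the significant pairs out of the TukeyHSD() result.

```r
# Simulated stand-in data: 4 areas x 12 shops (values are invented)
set.seed(1)
sim <- data.frame(
  Area  = factor(rep(paste("Area", 1:4), each = 12)),
  Sales = rnorm(48, mean = rep(c(10, 10, 15, 10), each = 12), sd = 1)
)

# Fit the one-way ANOVA once and reuse it for the Tukey test
fit <- aov(Sales ~ Area, data = sim)
tukey <- TukeyHSD(fit, conf.level = 0.95)

# One row per pair: difference, confidence limits and adjusted p-value
result <- as.data.frame(tukey$Area)
result[result[["p adj"]] < 0.05, ]
```

Storing the fitted model in a variable avoids running aov() twice, and plot(tukey) draws the same confidence-interval plot as the call above.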