Mann-Whitney U-test

The Mann-Whitney U-test is a non-parametric test that is used to compare the medians of two populations. You test the null-hypothesis that M1 = M2. Contrary to parametric tests, the Mann-Whitney U-test does not use the original values in the calculation of the test statistic but rather the ranks of the values.

You need to check the following assumption before proceeding with the U-test:

  1. The observations are independent

Before calculating the test statistic, the original values must be converted to ranks.
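As a minimal sketch of the ranking step in R: the two samples are pooled and ranked together, and tied values share the average of their ranks. The counts and variable names below are made up for illustration and are not the data from the example.

```r
# Made-up counts for illustration (not the article's data)
forest <- c(2, 4, 4, 7)
meadow <- c(4, 8, 9, 11)

# Rank all observations together; tied values get the average of their ranks
pooled <- c(forest, meadow)
ranks <- rank(pooled, ties.method = "average")

# Split the pooled ranks back into the two samples
ranks_forest <- ranks[seq_along(forest)]
ranks_meadow <- ranks[length(forest) + seq_along(meadow)]

# Rank sums of each sample
r_forest <- sum(ranks_forest)
r_meadow <- sum(ranks_meadow)
```

The three observations with the value 4 would occupy ranks 2, 3 and 4, so each of them receives rank 3.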

The Mann-Whitney U-test relies on the test statistics U1 and U2, which are calculated by:

U_1 = n_1n_2 + \frac{n_2(n_2+1)}{2} - R_2

U_2 = n_1n_2 + \frac{n_1(n_1+1)}{2} - R_1

where n1 and n2 are the sample sizes of sample 1 and 2, respectively, and R1 and R2 are the sums of the ranks of each sample. The smaller of U1 and U2 is selected as the test statistic and compared with the critical value found in a table.

If your calculations are right, U1 + U2 = n1n2.

If the calculated test statistic is less than the critical value, the null-hypothesis is rejected.
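The calculation can be sketched as a small R function. The function name mann_whitney_u and the counts are hypothetical, and the critical value still has to be looked up in a table for your sample sizes. Note that U1 is computed from R2 and U2 from R1, which is what makes U1 + U2 = n1n2 hold.

```r
# Hypothetical helper (not from the article): U statistics from two samples
mann_whitney_u <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  ranks <- rank(c(x, y))               # pooled ranks, ties averaged (default)
  r1 <- sum(ranks[seq_len(n1)])        # rank sum of sample 1
  r2 <- sum(ranks[n1 + seq_len(n2)])   # rank sum of sample 2
  u1 <- n1 * n2 + n2 * (n2 + 1) / 2 - r2
  u2 <- n1 * n2 + n1 * (n1 + 1) / 2 - r1
  # Sanity check: the two statistics must add up to n1 * n2
  stopifnot(isTRUE(all.equal(u1 + u2, n1 * n2)))
  list(U1 = u1, U2 = u2, U = min(u1, u2))
}

# Made-up counts for illustration (not the article's data)
forest <- c(2, 4, 4, 7)
meadow <- c(4, 8, 9, 11)
result <- mann_whitney_u(forest, meadow)
```

The element result$U holds the smaller of the two statistics, which is the value you compare with the critical value.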

Example

You want to see if the density of a specific plant species differs between two habitats, forest and meadow.

1. Construct the null-hypothesis

H0: The median number of plants per m2 is the same within a forest and a meadow (M_{FOREST} = M_{MEADOW})

2. Do the experiment

You go out and count the number of plants per m2 in a number of frames in each type of habitat.

3. Sort the samples from each habitat and calculate the median of each sample, in this case:

4. Rank the observations:

5. Sum the ranks for each sample

RFOREST = 57.5

RMEADOW = 113.5

6. Calculate the test statistics UFOREST and UMEADOW

UFOREST = 12.5

UMEADOW = 68.5

7. Check that UFOREST + UMEADOW = nFOREST x nMEADOW: 12.5 + 68.5 = 81 = 9 x 9

8. Look up the critical value for U at α = 0.05

We check the U-table at nFOREST = nMEADOW = 9 where α = 0.05. There we find that Uα=0.05 = 17

9. Compare the calculated U statistic with Uα=0.05

We always choose the smallest U statistic from our calculations, which is UFOREST = 12.5

U < Uα=0.05, since 12.5 < 17

10. Reject or retain H0

H0 can be rejected; the median densities in forest and meadow differ (MFOREST ≠ MMEADOW)

11. Interpret the result

We are more than 95 % certain that the density of plants is higher within the meadow compared to the forest.
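Steps 6 to 9 can be retraced in R with the numbers from the example. This sketch assumes nine frames per habitat (the n = 9 used for the table lookup); the variable names are only illustrative.

```r
# Values taken from the worked example; n = 9 per habitat is an assumption
# based on the sample size used for the table lookup
n_forest <- 9
n_meadow <- 9
r_forest <- 57.5    # rank sum, forest
r_meadow <- 113.5   # rank sum, meadow
u_crit <- 17        # critical value from the table at alpha = 0.05

# Step 6: the two U statistics
u_forest <- n_forest * n_meadow + n_meadow * (n_meadow + 1) / 2 - r_meadow
u_meadow <- n_forest * n_meadow + n_forest * (n_forest + 1) / 2 - r_forest

# Step 7: the two statistics must add up to n_forest * n_meadow
stopifnot(u_forest + u_meadow == n_forest * n_meadow)

# Step 9: compare the smaller U with the critical value
u <- min(u_forest, u_meadow)
reject_h0 <- u < u_crit
```

With these inputs u_forest is 12.5 and u_meadow is 68.5, matching the values in step 6, and reject_h0 is TRUE.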

How to do it in R

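#1. Import the data (semicolon-separated file with decimal commas)

	data <- read.csv("http://www.ilovestats.org/wp-content/uploads/2015/08/U-test1.csv", dec = ",", sep = ";")

#2. Run the test (for two independent samples, wilcox.test performs the Mann-Whitney U-test)

	wilcox.test(data$Forest, data$Meadow)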
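
If you want to work with the result rather than just print it, you can store it in an object. This is a sketch with made-up counts (not the data in the CSV file). exact = FALSE is set because tied values rule out an exact p-value, so R uses a normal approximation. The statistic that R reports as W is the U value of the first sample, so it is not necessarily the smaller of the two U values.

```r
# Made-up counts for illustration (not the article's data)
forest <- c(2, 4, 4, 7)
meadow <- c(4, 8, 9, 11)

# exact = FALSE: with ties an exact p-value cannot be computed
res <- wilcox.test(forest, meadow, exact = FALSE)

w <- unname(res$statistic)  # W, the U value of the first sample (forest)
p <- res$p.value            # two-sided p-value

# The U value of the other sample follows from U1 + U2 = n1 * n2
u_other <- length(forest) * length(meadow) - w
u <- min(w, u_other)        # the smaller U, as used with a table of critical values
```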