Maximum likelihood

Maximum likelihood (ML) is, in my opinion, one of the most fun parts of statistics. It is also a very important concept that lays the groundwork for more advanced methods such as Generalized Linear Models.

ML and the binomial distribution

Remember the Heads and Tails example from the binomial distribution section? The equation for the binomial distribution can be used in the other direction to calculate the likelihood that the coin is fair, i.e. that p = 0.5. You toss a coin 10 times and get the result: 2 Heads and 8 Tails. Is the coin fair?

We repeat the equation for the binomial distribution:

$$P(x \mid k, p) = \binom{k}{x}\, p^{x}\, q^{k - x}$$

where P is the probability of getting, for example, x heads on k trials, p is the probability of getting a head on each trial, and q = 1 − p is the probability of getting a tail.

Here you ask: “I know that the probability of getting a Head with my coin is 0.5, what is the probability of getting 2 Heads in 10 tosses?”

But when dealing with maximum likelihood you turn everything around and ask instead: “I got 2 Heads in 10 tosses, what is the likelihood that the probability of getting a Head with my coin is 0.5?” or “What is the probability of getting a Head with my coin, really?”

To answer this question we change the symbols on the left-hand side of the equation slightly (everything else is the same):

$$L(p \mid x, k) = \binom{k}{x}\, p^{x}\, q^{k - x}$$

This equation says: What is the likelihood (L) that the probability of an outcome (p) is a specific value between 0 and 1 (0.2, 0.5 or 0.8, for example) when we have observed x outcomes on k trials?

We put in the number of Heads we observed (x = 2) and the number of trials (k = 10). We want to find out if p = q = 0.5:

$$L(p = 0.5 \mid x = 2, k = 10) = \binom{10}{2}\, 0.5^{2}\, 0.5^{8} = 45 \times 0.5^{10} \approx 0.044$$
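The same number can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the function name `binomial_likelihood` is made up for this illustration and does not come from any package.

```python
from math import comb

def binomial_likelihood(p, x, k):
    """Likelihood of p given x observed outcomes on k trials."""
    q = 1 - p  # probability of the other outcome (a tail)
    return comb(k, x) * p**x * q**(k - x)

# 2 Heads in 10 tosses, evaluated at the fair-coin value p = 0.5
likelihood = binomial_likelihood(0.5, x=2, k=10)
print(round(likelihood, 4))  # 0.0439
```

The arithmetic is identical to the forward binomial probability; only the question changes, since x and k are now fixed by the data and p is the quantity being examined.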

The likelihood is about 0.044, i.e. less than 5 %. So it is very unlikely to observe 2 Heads on 10 tosses when the probability of getting a Head in each toss is 0.5. But what is the probability of getting a Head with this coin? Well, we can calculate the likelihood with the same values for x and k, but changing the value of p. The p with the highest likelihood is the most likely probability of getting a Head with this coin. To find it, we work out the likelihood with x = 2 and k = 10 for a range of p values and compare the results.
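Scanning a range of p values can be sketched in Python as follows. The grid of candidate values (0.1 to 0.9 in steps of 0.1) and the helper name `binomial_likelihood` are assumptions chosen for this illustration; a finer grid would work the same way.

```python
from math import comb

def binomial_likelihood(p, x, k):
    """Likelihood of p given x observed outcomes on k trials."""
    return comb(k, x) * p**x * (1 - p)**(k - x)

x, k = 2, 10  # observed: 2 Heads in 10 tosses

# Candidate values of p (an assumed grid: 0.1, 0.2, ..., 0.9)
p_values = [i / 10 for i in range(1, 10)]
likelihoods = {p: binomial_likelihood(p, x, k) for p in p_values}

for p, lik in likelihoods.items():
    print(f"p = {p:.1f}   L = {lik:.4f}")

# The candidate with the largest likelihood
best_p = max(likelihoods, key=likelihoods.get)
print("p with the highest likelihood:", best_p)
```

On this grid the largest likelihood, about 0.302, occurs at p = 0.2.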

So the most likely value of p for this coin is 0.2. This procedure is called maximum likelihood. The method can be used for any distribution to estimate the values of the parameters in the equation/model. Maximum likelihood is used in Generalized Linear Models.
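For the binomial case the maximum can also be found without trying out p values one by one. The steps below are the standard calculus route, added here as a sketch rather than taken from the example above: take the logarithm of the likelihood, differentiate with respect to p, and set the derivative to zero.

$$\ln L(p \mid x, k) = \ln \binom{k}{x} + x \ln p + (k - x) \ln(1 - p)$$

$$\frac{d \ln L}{dp} = \frac{x}{p} - \frac{k - x}{1 - p} = 0 \quad\Longrightarrow\quad \hat{p} = \frac{x}{k} = \frac{2}{10} = 0.2$$

The estimate is simply the observed proportion of Heads, which agrees with the value of 0.2 found for this coin.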