## Maximum likelihood

Maximum likelihood (ML) is, in my opinion, one of the most fun parts of statistics. It is also a very important concept that lays the groundwork for more advanced methods such as generalized linear models.

### ML and the binomial distribution

Remember the Heads and Tails example from the binomial distribution section? The equation for the binomial distribution can also be used in the other direction, to calculate the likelihood that the coin is fair, i.e. that $p = 0.5$. You toss a coin 10 times and get 2 Heads and 8 Tails. Is the coin fair?

We repeat the equation for the binomial distribution:

$$P(x \mid k, p) = \frac{k!}{x!\,(k-x)!}\, p^{x} q^{k-x}$$

where $P$ is the probability of getting, for example, $x$ heads in $k$ trials, $p$ is the probability of getting a head on each trial and $q = 1 - p$ is the probability of getting a tail.

Here you ask: “I know that the probability of getting a Head with my coin is 0.5, what is the probability of getting 2 Heads in 10 tosses?”
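This forward question can be checked with a few lines of Python. The following is a minimal sketch using only the standard library; the function name `binomial_pmf` is chosen here for illustration and does not come from the text.

```python
from math import comb


def binomial_pmf(x: int, k: int, p: float) -> float:
    """Probability of exactly x Heads in k independent tosses."""
    q = 1.0 - p  # probability of a Tail on each toss
    # comb(k, x) equals k! / (x! * (k - x)!)
    return comb(k, x) * p**x * q ** (k - x)


# Probability of 2 Heads in 10 tosses of a fair coin
prob = binomial_pmf(2, 10, 0.5)
print(f"P(2 Heads in 10 tosses | p = 0.5) = {prob:.4f}")
```

The printed value is roughly 0.044, so fewer than 5 in 100 such experiments with a fair coin would give exactly 2 Heads.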

With maximum likelihood you turn everything around and ask instead: “I got 2 Heads in 10 tosses, what is the likelihood that the probability of getting a Head with my coin is 0.5?” or “What is the probability of getting a Head with my coin, really?”

To answer this question we change the symbols on the left-hand side of the equation slightly (everything else is the same):

$$L(p \mid x, k) = \frac{k!}{x!\,(k-x)!}\, p^{x} q^{k-x}$$

This equation says: what is the likelihood ($L$) that the probability of an outcome ($p$) is a specific value between 0 and 1 (0.2, 0.5 or 0.8, for example) when we have observed $x$ outcomes in $k$ trials?

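We put in the number of Heads we observed in 10 trials. We want to find out if $p = q = 0.5$:

$$L(p = 0.5 \mid x = 2, k = 10) = \frac{10!}{2!\,8!} \times 0.5^{2} \times 0.5^{8} = 45 \times 0.5^{10} \approx 0.044$$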

The likelihood is about 0.044, i.e. less than 5 %. So it is quite unlikely to observe 2 Heads in 10 tosses when the probability of getting a Head on each toss is 0.5. But what is the probability of getting a Head with this coin? We can calculate the likelihood with the same values for $x$ and $k$ while changing the value of $p$. The $p$ with the highest likelihood is the most likely probability of getting a Head with this coin. Working out the likelihood with $x = 2$ and $k = 10$ for a range of $p$ values gives, for example, $L \approx 0.194$ at $p = 0.1$, $0.302$ at $p = 0.2$, $0.233$ at $p = 0.3$, $0.044$ at $p = 0.5$ and less than $0.001$ at $p = 0.8$.
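Scanning a range of $p$ values is easy to automate. The sketch below evaluates the likelihood on a grid of candidate values and picks the largest. The grid step of 0.01 and the names `likelihood`, `candidates` and `best_p` are assumptions made for this illustration.

```python
from math import comb


def likelihood(p: float, x: int = 2, k: int = 10) -> float:
    """Binomial likelihood of p, given x Heads observed in k tosses."""
    return comb(k, x) * p**x * (1.0 - p) ** (k - x)


# Candidate values 0.00, 0.01, ..., 1.00 (the step size is an arbitrary choice)
candidates = [i / 100 for i in range(101)]

# The candidate with the highest likelihood is the grid-based ML estimate
best_p = max(candidates, key=likelihood)

print(f"most likely p on the grid: {best_p:.2f}")
print(f"likelihood at that p:      {likelihood(best_p):.4f}")
```

A finer grid gives a more precise estimate at the cost of more evaluations. For this example the peak already falls exactly on a grid point.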

So the most likely $p$ value for this coin is 0.2. This procedure is called maximum likelihood. The method can be used with any distribution to estimate the values of the parameters in the equation or model. Maximum likelihood is used in generalized linear models.
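For the binomial case the maximum can also be found without trying many $p$ values. As a sketch, take the logarithm of the likelihood (which does not move the location of the maximum), write $q = 1 - p$, and set the derivative with respect to $p$ to zero. The symbol $\hat{p}$ for the resulting estimate is notation introduced here, not taken from the text above.

$$\ln L(p \mid x, k) = \ln\frac{k!}{x!\,(k-x)!} + x \ln p + (k-x)\ln(1-p)$$

$$\frac{d \ln L}{dp} = \frac{x}{p} - \frac{k-x}{1-p} = 0 \quad\Longrightarrow\quad \hat{p} = \frac{x}{k}$$

With $x = 2$ and $k = 10$ this gives $\hat{p} = 2/10 = 0.2$, the same value found by comparing likelihoods.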