Maximum likelihood (ML) is, in my opinion, one of the most fun parts of statistics. It is also a very important concept that lays the groundwork for more advanced methods such as generalized linear models.
ML and the binomial distribution
Remember the Heads and Tails example from the binomial distribution section? The equation for the binomial distribution can be used in the other direction in order to calculate the likelihood that the coin is fair, i.e. p = 0.5. You toss a coin 10 times and get the result: 2 Heads and 8 Tails. Is the coin fair?
We repeat the equation for the binomial distribution:
$$P(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}$$
where $P(x)$ is the probability of getting, for example, $x$ heads in $n$ trials, $p$ is the probability of getting a head on each trial and $q = 1 - p$ is the probability of getting a tail.
Here you ask: “I know that the probability of getting a Head with my coin is 0.5, what is the probability of getting 2 Heads in 10 tosses?”
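This forward question can be checked numerically. Below is a minimal Python sketch using only the standard library; the function name `binomial_probability` and its argument names are illustrative choices, not something taken from the text above.

```python
from math import comb


def binomial_probability(x: int, n: int, p: float) -> float:
    """Probability of exactly x heads in n independent tosses.

    p is the probability of a head on a single toss; q = 1 - p is the
    probability of a tail.
    """
    q = 1.0 - p
    # comb(n, x) is the binomial coefficient n! / (x! * (n - x)!)
    return comb(n, x) * p**x * q ** (n - x)


# Probability of 2 heads in 10 tosses of a fair coin (p = 0.5)
prob = binomial_probability(x=2, n=10, p=0.5)
print(f"P(2 heads in 10 tosses, p = 0.5) = {prob:.4f}")
```

The binomial coefficient here is 45, so the result is 45/1024, a little under 0.044.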
But when dealing with maximum likelihood you turn everything around and ask instead: “I got 2 Heads in 10 tosses, what is the likelihood that the probability of getting a Head with my coin is 0.5?” or “What is the probability of getting a Head with my coin, really?”
To answer this question we change the symbols on the left-hand side of the equation slightly (everything else is the same):
$$L(p \mid x, n) = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}$$
This equation says: what is the likelihood ($L$) that the probability of an outcome ($p$) is a specific value between 0 and 1 (0.2, 0.5 or 0.8 for example) when we have observed $x$ outcomes in $n$ trials?
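We put in the number of Heads we observed (2) and the number of trials (10). We want to find out if p = q = 0.5:
$$L(p = 0.5 \mid x = 2, n = 10) = \frac{10!}{2!\,8!} \times 0.5^{2} \times 0.5^{8} = 45 \times 0.25 \times 0.00390625 \approx 0.044$$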
The likelihood is less than 5 % (about 0.044). So it is very unlikely to observe exactly 2 Heads in 10 tosses when the probability of getting a Head in each toss is 0.5. But what is the probability of getting a Head with this coin? We can calculate the likelihood with the same values for $x$ and $n$, but changing the value of $p$. The $p$ with the highest likelihood is the most likely probability of getting a Head with this coin. Working out the likelihood with $x$ = 2 and $n$ = 10 for a range of $p$ values we get, for example, a likelihood of about 0.194 at $p$ = 0.1, 0.302 at $p$ = 0.2, 0.233 at $p$ = 0.3 and 0.044 at $p$ = 0.5.
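The sweep over candidate values can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grid of candidates (0.1 to 0.9 in steps of 0.1) and the names `likelihood`, `candidates` and `best_p` are choices made here, not part of the original text.

```python
from math import comb


def likelihood(p: float, x: int, n: int) -> float:
    """Binomial likelihood of p given x heads observed in n tosses."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


x, n = 2, 10  # observed: 2 heads in 10 tosses

# Assumed grid of candidate values for p: 0.1, 0.2, ..., 0.9
candidates = [i / 10 for i in range(1, 10)]
likelihoods = {p: likelihood(p, x, n) for p in candidates}

for p, value in likelihoods.items():
    print(f"p = {p:.1f}  likelihood = {value:.4f}")

# The candidate with the highest likelihood is the ML estimate on this grid
best_p = max(likelihoods, key=likelihoods.get)
print(f"highest likelihood at p = {best_p:.1f}")
```

A coarse grid like this only finds the best value among the candidates tried; a finer grid or a numerical optimizer would be needed when the maximum does not happen to fall on a grid point.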
So the most likely value of p for this coin is 0.2. This procedure is called maximum likelihood. The method can be used with any distribution to estimate the values of the parameters in the equation/model. Maximum likelihood is used in generalized linear models.
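For the binomial case the maximum can also be found analytically instead of by trying values. A short derivation sketch, writing $x$ for the observed number of Heads, $n$ for the number of tosses and $\hat{p}$ for the value that maximizes the likelihood (the hat notation is an assumption introduced here):

$$
\begin{aligned}
\log L(p \mid x, n) &= \log\frac{n!}{x!\,(n-x)!} + x \log p + (n-x)\log(1-p) \\
\frac{d}{dp}\log L(p \mid x, n) &= \frac{x}{p} - \frac{n-x}{1-p} = 0 \\
\hat{p} &= \frac{x}{n} = \frac{2}{10} = 0.2
\end{aligned}
$$

Taking the logarithm does not move the maximum, and it turns the product into a sum that is easier to differentiate. The result is simply the observed proportion of Heads.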