The multinomial distribution

The multinomial distribution is an extension of the binomial distribution and has proven quite useful in my research. I used it to calculate the probability of juvenile fish being present in four different areas along a shore. Together with maximum likelihood, it let me estimate these probabilities from the number of fish I had observed in each area.

Like the binomial distribution, the multinomial distribution deals with a variable measured on a nominal scale. The difference is that the multinomial distribution can handle several categories, whereas the binomial distribution can only handle two: Head or Tail, 0 or 1.

The equation of the multinomial distribution is:

P(x_1, \ldots, x_n) = \frac{k!}{x_1! \cdots x_n!} p_1^{x_1} \cdots p_n^{x_n}

where P is the probability of observing the counts x_1, …, x_n in categories 1, …, n out of k trials, when the probability of an observation falling in each category is p_1, …, p_n. The counts sum to k and the probabilities sum to 1.
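To make the formula concrete, here is a minimal sketch in R that evaluates it term by term and compares the result with R's built-in dmultinom. The function name multinomial_prob and the example counts and probabilities are made up for illustration only.

```r
# Hypothetical helper: evaluates the multinomial formula directly.
# x = vector of counts per category, p = vector of category probabilities.
multinomial_prob <- function(x, p) {
  stopifnot(length(x) == length(p), abs(sum(p) - 1) < 1e-8)
  k <- sum(x)  # the number of trials is the sum of the counts
  factorial(k) / prod(factorial(x)) * prod(p^x)
}

# Illustrative values (assumed, not from any data set): 3 categories, 4 trials
x_example <- c(1, 2, 1)
p_example <- c(0.2, 0.5, 0.3)

by_hand  <- multinomial_prob(x_example, p_example)
built_in <- dmultinom(x_example, prob = p_example)

print(by_hand)
print(built_in)
```

Both calls should agree, since dmultinom implements the same distribution. The direct formula is only meant for small counts, because factorials grow very quickly.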

Now how can we use the multinomial equation?


We turn to the dice example in the probability theory section: a die has six sides, and the probability of observing any one of these sides in a toss is \frac{1}{6} \approx 0.17.

What is the probability of observing the side with 2 five times in five tosses, that is, five times in a row? We sum everything up in a table:

Side          1    2    3    4    5    6
Count (x)     0    5    0    0    0    0
Probability   1/6  1/6  1/6  1/6  1/6  1/6

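Putting this in the multinomial equation we get:

P(0, 5, 0, 0, 0, 0) = \frac{5!}{0! \, 5! \, 0! \, 0! \, 0! \, 0!} \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^5 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^0 = \left(\frac{1}{6}\right)^5 = \frac{1}{7776}

The probability is about 0.00013, that is, roughly 0.013 percent.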

How to do it in R

#Set the vector of outcomes for each category (side of the die) in 5 tosses
x <- c(0, 5, 0, 0, 0, 0)
#Set the number of tosses (5)
k <- 5
#Set the probability for each category
p <- rep(1/6, 6)
#Use the function for the multinomial distribution
dmultinom(x, size = k, prob = p)
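
As a rough sanity check, the same probability can be approximated by simulation. The sketch below uses R's rmultinom to draw many sets of five tosses and counts how often all five show the side with 2. The number of simulated sets, the seed, and the variable names are arbitrary choices for illustration.

```r
# Simulate many sets of 5 tosses of a fair die (assumed settings below)
set.seed(1)                  # arbitrary seed, only for reproducibility
n_sets  <- 1e6               # number of simulated sets of 5 tosses
p_sides <- rep(1/6, 6)       # equal probability for each side

# rmultinom returns a matrix: one row per side, one column per simulated set
sims <- rmultinom(n_sets, size = 5, prob = p_sides)

# Proportion of sets in which side 2 came up in all 5 tosses
estimate <- mean(sims[2, ] == 5)
exact    <- (1/6)^5

print(estimate)
print(exact)
```

Because the event is rare, the estimate will fluctuate around the exact value and only agrees closely when the number of simulated sets is large.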

More in depth

The equation of the multinomial distribution resembles that of the binomial distribution but is slightly changed to incorporate more than two categories. To understand the link between the multinomial and binomial distributions, let’s calculate the probability of observing 2 Heads in 3 tosses using the multinomial distribution. Then we compare it with the result from the binomial distribution:

Outcome       Head  Tail
Count (x)     2     1
Probability   0.5   0.5

We put this into the multinomial distribution equation:

P(2, 1) = \frac{3!}{2! \, 1!} \cdot 0.5^2 \cdot 0.5^1 = 3 \cdot 0.125 = 0.375

That is the same result as in the example using the binomial distribution. In both cases there are 3 combinations in which 2 Heads are observed, and each of these combinations has the same probability. So if we get the same result, why do the equations differ? The multinomial distribution has to take several categories into account, each with its own count and probability. In the binomial distribution we only have to consider Head and Tail, with counts x and k - x and probabilities p and q = 1 - p. If the count is known for one of the two outcomes, we can calculate the other, and the same holds for the probabilities. So to save us some sweat, the equation is written in a simpler form when there are only two outcomes: \frac{k!}{x! \, (k - x)!} p^x q^{k - x}. The binomial distribution is simply a special case of the multinomial distribution.
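
The comparison can also be run in R. This is a small sketch using the built-in functions dmultinom and dbinom for the coin example of 2 Heads in 3 tosses. The variable names are chosen here for illustration.

```r
# Multinomial with two categories: 2 Heads and 1 Tail, each with probability 0.5
p_multinomial <- dmultinom(c(2, 1), prob = c(0.5, 0.5))

# Binomial: 2 Heads in 3 tosses with probability 0.5 of a Head
p_binomial <- dbinom(2, size = 3, prob = 0.5)

print(p_multinomial)
print(p_binomial)

# TRUE if the two results agree up to floating point tolerance
print(isTRUE(all.equal(p_multinomial, p_binomial)))
```

Both functions return 0.375 here, which is what we expect if the binomial distribution is the two-category case of the multinomial distribution.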