The multinomial distribution

The multinomial distribution is an extension of the binomial distribution and has proven quite useful in my research. I used it to calculate the probability of juvenile fish being present in four different areas along a shore, combining it with maximum likelihood to estimate the probabilities from the number of fish I had observed in each area.
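
As a rough illustration of that kind of analysis, the sketch below estimates area probabilities from counts. The counts are invented for illustration and are not the data from the study. It assumes the standard closed-form maximum likelihood estimate for multinomial probabilities (each count divided by the total), which may differ from how the original calculation was carried out.

```r
# Hypothetical counts of juvenile fish in four shore areas (invented numbers)
counts <- c(12, 30, 8, 50)

# Maximum likelihood estimate: the observed share of fish in each area
p_hat <- counts / sum(counts)

# Log-likelihood of the observed counts under a given probability vector
loglik <- function(p) dmultinom(counts, prob = p, log = TRUE)

print(p_hat)
print(loglik(p_hat))        # log-likelihood at the estimate
print(loglik(rep(1/4, 4)))  # lower: equal probabilities fit these counts worse
```

Comparing the two log-likelihoods shows the idea behind maximum likelihood: the estimated probabilities make the observed counts more probable than an arbitrary alternative such as equal probabilities.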

The multinomial distribution deals with a variable measured on a nominal scale, just as the binomial distribution does. The difference is that the multinomial distribution can handle several categories, whereas the binomial distribution can only handle two: Head or Tail, 0 or 1.

The equation of the multinomial distribution is:

P(x_1, ..., x_n) = \frac{k!}{x_1! \cdots x_n!} p_1^{x_1} \cdots p_n^{x_n}

where P is the probability of observing the counts x_1, ..., x_n for categories 1, ..., n out of k trials, when the probability of an observation falling in each category is p_1, ..., p_n. The counts sum to k and the probabilities sum to 1.
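
The multinomial probability can be written out term by term in R, as in the minimal sketch below. The function name `multinomial_prob` and the three-category example values are assumptions made for illustration. The result is compared with R's built-in `dmultinom`.

```r
# Hypothetical helper: multinomial probability of counts x given probabilities p
multinomial_prob <- function(x, p) {
  stopifnot(length(x) == length(p), isTRUE(all.equal(sum(p), 1)))
  k <- sum(x)  # number of trials
  # k! / (x_1! * ... * x_n!) * p_1^x_1 * ... * p_n^x_n
  factorial(k) / prod(factorial(x)) * prod(p^x)
}

# Illustrative example: 4 trials spread over 3 categories
x <- c(1, 2, 1)
p <- c(0.2, 0.5, 0.3)

by_hand  <- multinomial_prob(x, p)
built_in <- dmultinom(x, prob = p)  # size defaults to sum(x)

print(c(by_hand = by_hand, built_in = built_in))
```

Both values should agree, since `dmultinom` evaluates the same probability (it works on the log scale internally, so tiny floating-point differences are possible).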

Now, how can we use the multinomial equation?


We turn to the dice example in the probability theory section: a die has six sides, and the probability of observing any one of these sides in a toss is \frac{1}{6} \approx 0.17.

What is the probability of observing the side with 2 five times in five tosses, that is, five times in a row? We summarise everything in a table:

Side              1     2     3     4     5     6
Count (x)         0     5     0     0     0     0
Probability (p)   1/6   1/6   1/6   1/6   1/6   1/6

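Putting this in the multinomial equation we get:

P(0, 5, 0, 0, 0, 0) = \frac{5!}{0! \, 5! \, 0! \, 0! \, 0! \, 0!} \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^5 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^0 \left(\frac{1}{6}\right)^0 = \left(\frac{1}{6}\right)^5 = \frac{1}{7776} \approx 0.00013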

The probability is about 0.0001, that is, roughly 0.01 percent.

How to do it in R

# Set the vector of outcomes for each category (side of the die) in 5 tosses
x <- c(0, 5, 0, 0, 0, 0)
# Set the number of tosses (5)
k <- 5
# Set the probability for each category
p <- rep(1/6, 6)
# Use the function for the multinomial distribution (arguments: x, size, prob)
dmultinom(x, k, p)
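
As a cross-check, the dice result can be compared with the direct calculation for five identical tosses in a row. This is a small sketch that defines its own variables (`counts`, `probs`) so it runs on its own; it also shows how much the rounded figure of 0.0001 hides.

```r
# Counts per side (1 to 6) in 5 tosses: side 2 observed five times
counts <- c(0, 5, 0, 0, 0, 0)
# Equal probability for each side of a fair die
probs <- rep(1/6, 6)

# Built-in multinomial probability
built_in <- dmultinom(counts, size = 5, prob = probs)

# Direct calculation: the same side five times in a row
direct <- (1/6)^5

print(c(built_in = built_in, direct = direct))
print(round(built_in, 4))        # rounded to four decimals
print(signif(100 * built_in, 2)) # as a percentage, two significant digits
```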

More in depth

The equation of the multinomial distribution resembles that of the binomial distribution but is slightly changed to incorporate more than two categories. To understand the link between the multinomial and binomial distributions, let's calculate the probability of observing 2 Heads in 3 tosses using the multinomial distribution and then compare it with the result from the binomial distribution. Assuming a fair coin, the counts and probabilities are:

Category          Head   Tail
Count (x)         2      1
Probability (p)   0.5    0.5

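We put this into the multinomial distribution equation, with a probability of 0.5 for both Head and Tail:

P(2, 1) = \frac{3!}{2! \, 1!} \cdot 0.5^2 \cdot 0.5^1 = 3 \cdot 0.125 = 0.375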

That is the same as in the example using the binomial distribution. The number of combinations in which 2 Heads are observed is 3 in both examples, and the probability of each of these combinations is the same. So if we get the same results, why do the equations differ? Because the multinomial distribution has to take several categories into account. In the binomial distribution we only have to consider Head and Tail, with counts x and k - x and probabilities p and q = 1 - p. If the count is known for one of the two outcomes, we can calculate the other; the same holds for the probabilities. So, to save us some sweat, the equation is simplified when dealing with only two outcomes or categories. The binomial distribution is simply a special case of the multinomial distribution.
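
The special-case relationship can be checked numerically. The sketch below assumes a fair coin (p = 0.5) and 3 tosses, and compares `dbinom` with `dmultinom` for every possible number of Heads; the variable names are chosen for illustration.

```r
k <- 3    # number of tosses
p <- 0.5  # probability of Head (fair coin assumed)

heads <- 0:k  # every possible number of Heads

# Binomial probability of each number of Heads
binom <- dbinom(heads, size = k, prob = p)

# Multinomial probability with two categories: (Heads, Tails)
multinom <- sapply(heads, function(h) {
  dmultinom(c(h, k - h), prob = c(p, 1 - p))
})

print(data.frame(heads, binom, multinom))
```

The two columns should match row by row, including the value for 2 Heads in 3 tosses.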