The standard error

I believe the standard error is one of the most confusing concepts for those who are new to statistics. That is my experience, anyway, from teaching students in basic statistics. The reason is that it is very easy to mix it up with the standard deviation, which is understandable, since the standard error is a type of standard deviation.

Simply speaking, the standard error is a measure of how well you have estimated a population parameter such as the mean. A small value means that the estimate has high precision, and a large value means that it has low precision. The standard deviation, on the other hand, is a measure of variability within the population (or sample). Besides indicating how good your estimate is, the standard error is used to produce a confidence interval: an interval that contains the true parameter value with a specific probability (usually 0.95).

Use this equation to calculate the standard error of a mean:

SE = \frac{s}{\sqrt{n}}

where SE is the standard error, s is the standard deviation of the sample, and n is the number of observations (units) within the sample.

As you can see, the standard error is a function of the sample size (n): larger samples result in smaller standard errors and thus higher precision of the mean. That is logical: the larger the set of observations you draw from a population, the more likely you are to come close to the true mean. Take a look at the plot below, where the standard deviation remains the same but the number of observations (n) varies:

Example

Calculate the standard error of a mean where s = 0.2 and the total number of observations (n) is 10.

Use the equation for the standard error of a mean:

SE = \frac{0.2}{\sqrt{10}} \approx 0.06

Answer: The standard error is approximately 0.06.
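The same calculation can be done in a couple of lines of base R:

```r
# Standard error of the mean: SE = s / sqrt(n)
s <- 0.2          # sample standard deviation
n <- 10           # number of observations
SE <- s / sqrt(n)
round(SE, 2)      # 0.06
```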

More in depth

Central limit theorem

Ok, now let's confuse things. I want you to truly understand the theory behind the standard error and what it really is.

Think about the following: if you sample a population several times and calculate the mean each time, do you think you get the same estimate every time? No, you will not. The estimates will vary due to something called sampling error. That is not an error on your behalf, but error due to chance: since you take samples from the population, you will by chance get different estimates of the true mean. Now I'll introduce an interesting phenomenon: all these means you get from taking lots and lots of samples from a population conform to a normal distribution. This is called the Central Limit Theorem.
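A quick way to see sampling error in action is to draw a few samples from a simulated population and compare their means (a minimal sketch in base R; the population parameters are the same ones used later in this article):

```r
set.seed(1)                               # for reproducibility
pop <- rnorm(1000, mean = 50, sd = 2)     # a population with µ = 50, σ = 2
m <- replicate(5, mean(sample(pop, 20)))  # five sample means, each from n = 20
m   # five slightly different estimates of the true mean
```

Every run of `sample()` picks a different subset of the population, so the five means all land near 50 but never exactly agree.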

Say you sample a population (N = 1000) with µ = 50 and σ = 2 a million times and estimate the mean every single time, with a sample size of n = 20 in every sample. When you plot the distribution you get:

The estimated means (\overline{x}_n) are centered around the true mean (\mu) of the population that we have drawn the samples from. Since we draw (almost) an infinite number of samples, the distribution above contains practically all possible estimates of the mean one can get with a sample of n = 20.

What happens if we alter the sample size? Consider the plots below, where I have sampled the same population again but with a different sample size (n) in each plot:

Notice that the standard deviation (sd_MEANS) decreases with increasing sample size. In other words, the variability within the population of estimated means decreases with higher n. In the example above, an estimated mean from a sample with n = 40 will practically never fall below 49 or above 51. That is quite good precision. In contrast, an estimate can vary between 47 and 53 when n = 5. So, you get an estimate of higher precision with a higher sample size. That leads us to the standard error:

The standard deviation of the population of estimated means is the standard error. As the sample size increases, the standard error decreases, i.e. the mean is estimated with higher precision.

You don't have to draw an infinite number of samples from a population to calculate the standard error. Just use the equation for the standard error given at the beginning of this article.
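You can check that the equation and the simulation agree: the standard deviation of many simulated sample means comes out close to s/\sqrt{n} (a sketch in base R; the seed and the number of replicates are arbitrary, and samples are drawn straight from the distribution rather than from a finite population):

```r
set.seed(42)
n <- 20
# 10,000 sample means, each from a sample of size n drawn from N(50, 2)
means <- replicate(10000, mean(rnorm(n, mean = 50, sd = 2)))
sd(means)     # empirical standard error of the mean
2 / sqrt(n)   # theoretical standard error, about 0.45
```

The two numbers agree to about two decimal places, which is exactly the point: the formula replaces the need for repeated sampling.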

How to produce the graphs in this article in R

The central limit theorem: single graph


#Function for the plot

    H_fun <- function(m, s, n, N, b) {
      # m = population mean, s = population sd, n = sample size,
      # N = population size, b = number of samples to draw

      # Create the population
      x <- rnorm(N, m, s)

      # Draw b samples of size n from the population (with replacement)
      p <- matrix(ncol = b, nrow = n)
      for (i in 1:b) {
        p[, i] <- sample(x, n, replace = TRUE)
      }

      # Calculate the mean of every sample
      a <- apply(p, MARGIN = 2, FUN = mean)

      # Plot the distribution of the sample means
      hist(a, breaks = "Sturges",
           freq = FALSE, col = "#C20000", main = NULL,
           ylim = c(0, 1), xlim = c(47, 53), xlab = "Estimated Means",
           ylab = "Probability", bty = "l", las = 1, xaxt = "n", cex.lab = 1.2)

      axis(side = 1, at = seq(43, 57, 2), labels = seq(43, 57, 2),
           pos = 0, las = 1, tick = TRUE)

      # xpd = NA so the label is drawn even though y = 1.3 lies outside ylim
      text(49.5, 1.3, "n =", cex = 1.2, xpd = NA)
      text(50, 1.3, n, cex = 1.2, xpd = NA)
    }

#Using the function

    H_fun(50, 2, 20, 1000, 1000000)

The central limit theorem: multiple graphs

#Function for a plot

    H_fun <- function(m, s, n, N, b) {
      # m = population mean, s = population sd, n = sample size,
      # N = population size, b = number of samples to draw

      # Create the population
      x <- rnorm(N, m, s)

      # Draw b samples of size n from the population (with replacement)
      p <- matrix(ncol = b, nrow = n)
      for (i in 1:b) {
        p[, i] <- sample(x, n, replace = TRUE)
      }

      # Calculate the mean of every sample
      a <- apply(p, MARGIN = 2, FUN = mean)

      # The sd of the sample means, i.e. the standard error
      SE <- round(sd(a), 2)

      # Plot the distribution, with n and sd_MEANS in the panel title
      hist(a, breaks = "Sturges",
           freq = FALSE, col = "#C20000",
           main = bquote("n =" ~ .(n) ~ "|" ~ sd[MEANS] ~ "=" ~ .(SE)),
           ylim = c(0, 1), xlim = c(47, 53), xlab = "Estimated Means",
           ylab = "Probability", bty = "l", las = 1, xaxt = "n", cex.lab = 1.2)

      axis(side = 1, at = seq(43, 57, 2), labels = seq(43, 57, 2),
           pos = 0, las = 1, tick = TRUE)
    }

#Plotting

    par(mfcol = c(2, 2))

    H_fun(50, 2, 5, 1000, 1000000)
    H_fun(50, 2, 10, 1000, 1000000)
    H_fun(50, 2, 20, 1000, 1000000)
    H_fun(50, 2, 40, 1000, 1000000)