
The normal distribution

The normal distribution displays specific characteristics that form the basis for parametric tests. In the end it comes down to being able to calculate the standard error. If you understand how the mean and standard error work together within the normal curve, you are well on your way to really understanding the theory behind hypothesis testing using parametric tests.

Mathematical characteristics

The normal curve is symmetric and centered on the mean. Because of this symmetry, the mean, median and mode have exactly the same value in a perfect normal distribution. The normal distribution possesses mathematical properties that are very useful in hypothesis testing using parametric tests:

(1) About 68 % of all values in the population or sample fall within 1 standard deviation of the mean. That means that if the mean is 35 and the standard deviation is 2, you’ll find 68 % of the values of the distribution within the interval 35 ± 2, in other words between 33 and 37.

(2) About 95 % of the values in the distribution are found within 1.96 standard deviations of the mean. That is, you’ll find 95 % of the values within the interval 35 ± 1.96 × 2 = 35 ± 3.92, or 31.08 to 38.92.

(3) About 99 % of the values are found within 2.58 standard deviations of the mean. That is, you’ll find 99 % of the values within the interval 35 ± 2.58 × 2 = 35 ± 5.16, or 29.84 to 40.16.

(4) Almost the entire population (99.7 %) lies within 3 standard deviations of the mean. I guess you know how to calculate this interval by now.
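The four intervals above can be checked numerically. The following is a minimal sketch in base R, assuming the example values from the text (mean 35, standard deviation 2); the variable names are made up for illustration. It uses the built-in `pnorm` function, which returns the share of a normal distribution that lies below a given value.

```r
# Example values from the text (mean 35, standard deviation 2).
mu <- 35
sigma <- 2

# Number of standard deviations used in points (1) to (4).
z <- c(1, 1.96, 2.58, 3)

# Interval limits: mean minus/plus z standard deviations.
lower <- mu - z * sigma
upper <- mu + z * sigma

# Share of the distribution that falls between the limits.
coverage <- pnorm(upper, mean = mu, sd = sigma) -
  pnorm(lower, mean = mu, sd = sigma)

intervals <- data.frame(z = z, lower = lower, upper = upper,
                        coverage = round(coverage, 4))
print(intervals)
```

The `coverage` column should come out close to 0.68, 0.95, 0.99 and 0.997, which is why these particular multipliers keep showing up in parametric tests.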

What this says is that, for a normally distributed variable, the probability of drawing a value that lies more than 1.96 standard deviations from the mean is 0.05, and the probability of drawing one that lies more than 2.58 standard deviations from the mean is 0.01. Put differently, only a fraction of 0.05 and 0.01 of the values in the normal distribution are found beyond 1.96 and 2.58 standard deviations from the mean, respectively. A value that far out is therefore unlikely, though not impossible, to come from that population.
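To make the link between distance from the mean and probability concrete, here is a small sketch in base R. The helper `tail_prob` and the observation 39.5 are hypothetical, chosen only for illustration; the mean and standard deviation are the example values used earlier.

```r
# Two-sided tail probability: the share of values that lie at least
# z standard deviations from the mean, in either direction.
tail_prob <- function(z) 2 * pnorm(-abs(z))

p_196 <- tail_prob(1.96)  # close to 0.05
p_258 <- tail_prob(2.58)  # close to 0.01

# Hypothetical observation, judged against mean 35 and sd 2.
mu <- 35
sigma <- 2
value <- 39.5
z_value <- (value - mu) / sigma  # distance from the mean in sd units
p_value <- tail_prob(z_value)

print(c(p_196 = p_196, p_258 = p_258, p_value = p_value))
```

Here the observation lies 2.25 standard deviations from the mean, so its tail probability ends up between 0.01 and 0.05: rarer than the 1.96 limit, but not as rare as the 2.58 limit.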

Are you starting to grasp what this has to do with statistical tests? If not, you’ll get it soon enough. To really get the whole picture you need to read and understand the section about standard error and confidence intervals.

Important to remember

(1) The normal distribution is symmetric and possesses specific mathematical properties that enable you to calculate how probable it is to find a value at a given distance from the mean, and thereby to judge whether a specific value plausibly belongs to the population.

(2) Understanding the properties of the normal distribution is the first step in understanding the theory behind parametric tests.

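R code for the normal distribution graph

	# The original listing was truncated. The density loop, the limits
	# K.L/K.U, the curve and the marker lines are assumptions filled in
	# from the surviving plot(), axis() and text() calls.
	mnorm <- function(my, sigma, x) {
		# Normal density for each value of x
		p <- numeric(length(x))
		for (i in seq_along(x)) {
			p[i] <- 1/(sigma*sqrt(2*pi)) * exp(-(x[i]-my)^2/(2*sigma^2))
		}
		# Limits that enclose the central 95 % of the distribution
		K.L <- my - 1.96*sigma
		K.U <- my + 1.96*sigma
		# Empty plot frame, then a custom x axis labelled in Z-scores
		plot(x,p,ylab="",xlab="Z-Score = Nr of sd from mean",type="n",las=1,bty="l",pch=19,yaxt="n",xaxt="n")
		axis(side=1,at=c(K.L,my,K.U),labels = c(-1.96,0,1.96),pos=0,las=1,tick=FALSE)
		# Draw the curve and mark the two limits with vertical lines
		lines(x,p)
		segments(c(K.L,K.U),0,c(K.L,K.U),dnorm(c(K.L,K.U),my,sigma))
		text(my,max(p)/2,"95 %",font=2)
		text(my-(2.3*sigma),max(p)/18,"2.5 %",font=2)
		text(my+(2.3*sigma),max(p)/18,"2.5 %",font=2)
	}

	# Example call (illustrative values): standard normal from -4 to 4
	mnorm(0, 1, seq(-4, 4, length.out = 200))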