
Estimating population variance

26 January, 2016 - 11:31

Another common interval estimation task is to estimate the variance of a population. High-quality products need not only the proper mean dimension; the variance of that dimension should also be small. The estimation of population variance follows the same strategy as the other estimations. By choosing a sample and assuming that it is from the middle of the population, you can use a known sampling distribution to find a range of values that you are confident contains the population variance. Once again, we will use a sampling distribution that statisticians have discovered forms a link between samples and populations.

Take a sample of size n from a normal population with known variance \sigma^{2}, and compute a statistic called "\chi^{2}" (pronounced "chi square") for that sample using the following formula, where \bar{x} is the sample mean:

\chi^{2}=\frac{\sum (x-\bar{x})^{2}}{\sigma^{2}}

You can see that \chi^{2} will always be positive, because both the numerator and denominator will always be positive. Thinking it through a little, you can also see that as n gets larger, \chi^{2} will generally be larger, since the numerator will tend to be larger as more and more (x-\bar{x})^{2} are summed together. It should not be too surprising by now to find out that if all of the possible samples of size n are taken from any normal population, \chi^{2} is computed for each sample, and those \chi^{2} are arranged into a relative frequency distribution, the distribution is always the same.
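The claim that the distribution is always the same can be checked informally by simulation. The sketch below is only an illustration, not part of the original procedure: the two populations, their means and standard deviations, the seed, and the function names are all assumptions chosen for the example. It draws many samples of size 15 from each population, computes \chi^{2} for every sample with the formula above, and compares the averages. For both populations the average should land close to n-1, a standard property of the \chi^{2} distribution.

```python
import random
import statistics


def chi_square_statistic(sample, sigma_squared):
    """Sum of squared deviations from the sample mean, divided by the
    known population variance (the formula given in the text)."""
    mean = statistics.mean(sample)
    return sum((x - mean) ** 2 for x in sample) / sigma_squared


def simulate_chi_square(mu, sigma, n, repetitions, rng):
    """Draw `repetitions` samples of size n from a normal population and
    return the chi-square statistic of each sample."""
    values = []
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append(chi_square_statistic(sample, sigma ** 2))
    return values


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 15

# Two hypothetical normal populations with very different means and variances.
population_a = simulate_chi_square(mu=2.0, sigma=0.1, n=n, repetitions=5000, rng=rng)
population_b = simulate_chi_square(mu=500.0, sigma=40.0, n=n, repetitions=5000, rng=rng)

mean_a = statistics.mean(population_a)
mean_b = statistics.mean(population_b)
print(f"average chi-square, population A: {mean_a:.2f}")
print(f"average chi-square, population B: {mean_b:.2f}")
print(f"n - 1 = {n - 1}")
```

Comparing only the averages is a rough check; sorting the two lists and comparing several percentiles would give a fuller picture of how closely the two distributions agree.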

Because the size of the sample obviously affects \chi^{2} , there is a different distribution for each different sample size. There are other sample statistics that are distributed like \chi^{2} , so, like the t-distribution, tables of the \chi^{2} distribution are arranged by degrees of freedom so that they can be used in any procedure where appropriate. As you might expect, in this procedure, df = n-1. A portion of a \chi^{2} table is reproduced below.

media/image6.png
Figure 3.1 The \chi^{2} distribution
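If a printed table is not at hand, a row of it can be approximated by simulation. The sketch below relies on a standard fact that this section does not state: a \chi^{2} value with a given number of degrees of freedom behaves like the sum of that many squared, independent standard normal values. The function name, the number of repetitions, and the seed are assumptions made for the illustration. It estimates the values that cut off the lowest .05 and the highest .05 of the distribution for 14 df, which should come out near the table entries 6.57 and 23.7 used in the example that follows.

```python
import random


def simulated_chi_square_cutoffs(df, probabilities, repetitions=20000, seed=1):
    """Approximate chi-square cutoffs by simulation.

    Each simulated value is the sum of `df` squared standard normal draws.
    For each probability p, return the value below which roughly a
    fraction p of the simulated values fall.
    """
    rng = random.Random(seed)
    draws = sorted(
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(repetitions)
    )
    return [draws[int(p * (repetitions - 1))] for p in probabilities]


lower_cut, upper_cut = simulated_chi_square_cutoffs(df=14, probabilities=[0.05, 0.95])
print(f"about .95 of chi-square values exceed {lower_cut:.2f}")
print(f"about .05 of chi-square values exceed {upper_cut:.2f}")
```

The simulated cutoffs will differ a little from the table from run to run, so the table (or a statistics package) remains the better source for actual work.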

Variance is important in quality control because you want your product to be consistently the same. John McGrath has just returned from a seminar called "Quality Socks, Quality Profits". He learned something about variance, and has asked Kevin to measure the variance of the weight of Foothill's socks. Kevin decides that he can fulfill this request by using the data he collected when Mr. McGrath asked about the average weight of size 11 men's dress socks. Kevin knows that the sample variance is an unbiased estimator of the population variance, but he decides to produce an interval estimate of the variance of the weight of pairs of size 11 men's socks. He also decides that .90 confidence will be good until he finds out more about what Mr. McGrath wants.

Kevin goes and finds the data for the size 11 socks, and gets ready to use the \chi^{2} distribution to make a .90 confidence interval estimate of the variance of the weights of socks. His sample has 15 pairs in it, so he will have 14 df. From the \chi^{2} table he sees that .95 of \chi^{2} are greater than 6.57 and only .05 are greater than 23.7 when there are 14 df. This means that .90 are between 6.57 and 23.7. Assuming that his sample has a \chi^{2} that is in the middle .90, Kevin gets ready to compute the limits of his interval. He notices that he will have to find \sum (x-\bar{x})^{2} and decides to use his spreadsheet program rather than find (x-\bar{x})^{2} fifteen times. He puts the original sample values in the first column, and has the program compute the mean. Then he has the program find (x-\bar{x})^{2} fifteen times. Finally, he has the spreadsheet sum up the squared differences and finds 0.062.
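The spreadsheet steps Kevin follows can be sketched in a few lines of code. This is only an illustration: Kevin's fifteen sample weights are not listed in this section, so the three values below are made up purely to show the arithmetic, and the function name is hypothetical.

```python
def sum_of_squared_deviations(values):
    """Mirror the spreadsheet columns: compute the mean, then each squared
    deviation from the mean, then the total of those squared deviations."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return mean, squared_deviations, sum(squared_deviations)


# Made-up values for illustration only; these are NOT Kevin's sock weights.
illustrative_values = [2.0, 4.0, 6.0]

mean, squared_deviations, total = sum_of_squared_deviations(illustrative_values)
print("mean:", mean)
print("squared deviations:", squared_deviations)
print("sum of squared deviations:", total)
```

With Kevin's fifteen weights in place of the made-up values, the last number would be the 0.062 he finds.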

Kevin then takes the \chi^{2} formula, and solves it twice, once by setting \chi^{2} equal to 6.57:
\chi^{2}=6.57=.062/\sigma^{2}

Solving for \sigma^{2}, he finds one limit for his interval is .0094. He solves the second time by setting \chi^{2}=23.7:
23.7=.062/\sigma^{2}

and finds that the other limit is .0026. Armed with his data, Kevin reports to Mr. McGrath that "with .90 confidence, the variance of weights of size 11 men's socks is between .0026 and .0094."
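Kevin's two calculations can be collected into one small function. The sketch below assumes nothing beyond the numbers in the example: the sum of squared deviations, .062, and the two table values for 14 df, 6.57 and 23.7. The function and parameter names are made up for the illustration. Because \sigma^{2} sits in the denominator of the \chi^{2} formula, the larger table value gives the lower limit and the smaller table value gives the upper limit.

```python
def variance_interval(sum_squared_deviations, chi2_small, chi2_large):
    """Solve chi-square = sum_squared_deviations / variance for the variance
    at each table value and return (lower limit, upper limit)."""
    lower_limit = sum_squared_deviations / chi2_large
    upper_limit = sum_squared_deviations / chi2_small
    return lower_limit, upper_limit


# Values from the example: sum of squared deviations and the 14 df table entries.
low, high = variance_interval(0.062, chi2_small=6.57, chi2_large=23.7)
print(f"With .90 confidence, the variance is between {low:.4f} and {high:.4f}")
```

Rounded to four decimal places, the limits are .0026 and .0094, matching the interval Kevin reports.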