In the previous section we used the sample standard deviation as an estimate of the population
standard deviation, so it is natural to ask how good this estimate is. **Note**:
*this section is only applicable if the population is approximately normally distributed*.

Suppose we have a population of measurements with mean $\mu$ and variance $\sigma^2$, and we have
randomly selected a sample of size *n* from this population. To determine how to construct
confidence intervals for $\sigma^2$ we can use a thought experiment similar to the one we considered
for the estimation of a population proportion. Suppose we could obtain every possible sample of size
*n* from this population and compute the sample variance $s^2$ for each of these samples. The
experiment in which we randomly select a single sample of size *n* and compute the sample
variance of this sample would be equivalent to randomly selecting a single sample variance from the
population of all possible sample variances. Therefore, probability statements about the sample
variance could be derived from the distribution of all possible sample variances in a way that is
similar to how we constructed confidence intervals for a population proportion and population
mean. Statistical theory tells us that if the population distribution is approximately a normal
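curve, then the quantity

$$\frac{(n-1)s^2}{\sigma^2},$$

where $s^2$ is the sample variance and $\sigma^2$ is the population variance, has a distribution that is referred to as a chi-square distribution with $n-1$ degrees of freedom.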
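
The claim about the scaled sample variance can be checked informally by simulation. The following is a minimal sketch, assuming a normal population with illustrative values for `mu`, `sigma`, `n`, and `n_sims` (none of these come from the text). It draws many samples of size *n*, scales each sample variance by $(n-1)/\sigma^2$, and compares the results with the chi-square distribution with $n-1$ degrees of freedom.

```r
# Simulation sketch: sampling distribution of (n-1)s^2/sigma^2 for normal data.
# mu, sigma, n and n_sims are illustrative values, not taken from the text.
set.seed(1)
mu <- 10
sigma <- 2
n <- 22
n_sims <- 10000

# Draw many samples of size n and compute the sample variance of each
sample_vars <- replicate(n_sims, var(rnorm(n, mean = mu, sd = sigma)))

# Scale each sample variance as described in the text
scaled <- (n - 1) * sample_vars / sigma^2

# A chi-square distribution with n-1 degrees of freedom has mean n-1
# and variance 2(n-1); the simulated values should be close to these
c(mean = mean(scaled), variance = var(scaled))

# Compare simulated quantiles with the theoretical chi-square quantiles
probs <- c(0.025, 0.5, 0.975)
rbind(simulated = quantile(scaled, probs),
      theoretical = qchisq(probs, df = n - 1))
```

The simulated and theoretical quantiles should agree closely, and the gap between the median and each tail quantile should differ, which reflects the asymmetry of the chi-square distribution.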

Since the chi-square distribution is not symmetric, we need to find upper and lower values from this
distribution such that the area between them is the required level of confidence for the confidence
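interval. Let $C_{\text{lower}}$ denote the value from the chi-square distribution with $n-1$
degrees of freedom such that the area below it is $\alpha/2$, and let $C_{\text{upper}}$ denote the
value from the same distribution such that the area above it is $\alpha/2$, where $1-\alpha$ is the
required level of confidence. Then we can make the following probability statement,

$$P\left(C_{\text{lower}} \le \frac{(n-1)s^2}{\sigma^2} \le C_{\text{upper}}\right) = 1 - \alpha.$$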

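The confidence interval for $\sigma^2$ is derived by manipulating the inequalities inside this probability so that $\sigma^2$ stands alone in the middle. We can do this as follows:

$$C_{\text{lower}} \le \frac{(n-1)s^2}{\sigma^2} \quad\Longrightarrow\quad \sigma^2 \le \frac{(n-1)s^2}{C_{\text{lower}}}$$

and

$$\frac{(n-1)s^2}{\sigma^2} \le C_{\text{upper}} \quad\Longrightarrow\quad \frac{(n-1)s^2}{C_{\text{upper}}} \le \sigma^2.$$

Combine these inequalities to obtain

$$\frac{(n-1)s^2}{C_{\text{upper}}} \le \sigma^2 \le \frac{(n-1)s^2}{C_{\text{lower}}}.$$

Therefore a confidence interval for $\sigma^2$ is

$$\left(\frac{(n-1)s^2}{C_{\text{upper}}},\; \frac{(n-1)s^2}{C_{\text{lower}}}\right).$$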
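
As an illustration, the interval can be wrapped in a small function. This is only a sketch: the name `var_conf_int` and its arguments are hypothetical, the inputs `s = 3` and `n = 10` are made up, and the result is meaningful only under the normality assumption noted at the start of this section.

```r
# Hypothetical helper (name and arguments are illustrative):
# confidence interval for a population variance from a sample of size n
# with sample standard deviation s, assuming a normal population.
var_conf_int <- function(s, n, conf_level = 0.95) {
  alpha <- 1 - conf_level
  c_lower <- qchisq(alpha / 2, df = n - 1)      # area below is alpha/2
  c_upper <- qchisq(1 - alpha / 2, df = n - 1)  # area above is alpha/2
  # Dividing by the larger quantile gives the lower endpoint
  c(lower = (n - 1) * s^2 / c_upper,
    upper = (n - 1) * s^2 / c_lower)
}

# The area between the two quantiles equals the confidence level
n <- 10
pchisq(qchisq(0.975, df = n - 1), df = n - 1) -
  pchisq(qchisq(0.025, df = n - 1), df = n - 1)

ci <- var_conf_int(s = 3, n = 10)  # made-up inputs
ci
sqrt(ci)  # corresponding interval for the standard deviation
```

Because the chi-square distribution is not symmetric, the resulting interval is not centred on $s^2$; the upper endpoint lies further from $s^2$ than the lower one.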

Suppose for example that we wish to construct a 95% confidence interval for the population variance
of the difference between list and sales price based on the sample of size 22 in the example above.
In this case we would use a chi-square distribution with 21 degrees of freedom. Quantiles from the
chi-square distribution are obtained in **R** using the function `qchisq()`.

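    alpha <- 0.05
    n <- 22
    s <- 1150
    Cl <- qchisq(alpha/2, n-1)       # lower quantile: area below is alpha/2
    Cu <- qchisq(1-alpha/2, n-1)     # upper quantile: area above is alpha/2
    conf.int <- (n-1)*s^2/c(Cu, Cl)  # dividing by Cu gives the lower endpoint
    conf.int
    ## [1] 782789.7 2700843.7

We can convert this to a confidence interval for $\sigma$ by taking square roots of the endpoints of the interval. This gives

    sqrt(conf.int)
    ## [1] 884.7541 1643.4244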
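
One way to see what the 95% confidence level means for this interval is a coverage simulation. The sketch below assumes a normal population whose true standard deviation is set, purely for illustration, to 1150; `n_sims` and the seed are likewise arbitrary choices. It repeatedly draws samples of size 22, builds the interval for each, and records how often the interval contains the true variance.

```r
# Coverage sketch: how often does the interval capture the true variance?
# sigma and n_sims are illustrative choices, not values from the text.
set.seed(42)
n <- 22
sigma <- 1150
alpha <- 0.05
n_sims <- 5000

c_lower <- qchisq(alpha / 2, df = n - 1)
c_upper <- qchisq(1 - alpha / 2, df = n - 1)

covered <- replicate(n_sims, {
  x <- rnorm(n, mean = 0, sd = sigma)   # sample from a normal population
  s2 <- var(x)                          # sample variance
  lo <- (n - 1) * s2 / c_upper
  hi <- (n - 1) * s2 / c_lower
  lo <= sigma^2 && sigma^2 <= hi        # TRUE if the interval covers sigma^2
})

mean(covered)  # proportion of intervals covering sigma^2; should be near 0.95
```

Replacing `rnorm` with a generator for a skewed population, such as `rexp`, and adjusting `sigma` to match is one way to explore why the normality note at the start of this section matters.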

2015-05-01