next up previous
Next: Practical Guidelines for Monte Up: Examples Previous: Examples

Demonstration of statistical or probabilistic effects

This example considers how we could illustrate the performance of a confidence interval to estimate a population parameter. First consider large sample confidence intervals for a population proportion.

Simulation goals:
1. Illustrate how the central limit theorem is applicable. 2. Demonstrate that it is the interval that is random, not the parameter. That is, the interval is what changes from sample to sample.
3. Demonstrate how the widths of confidence intervals depend on sample size.
4. Show that the probability the interval contains the population proportion p does not depend on p.
5. Construct scripts for this demonstration that could be adapted easily to other types of confidence intervals.

To satisfy goal 1, we can simulate N random samples from Binomial(n,p), construct confidence intervals for each sample, and determine whether or not each interval contains p. Parameters for this sinmulation would include

\begin{displaymath}
N, n, p, \alpha
\end{displaymath}

Visualizations for this demonstration include:
1. Display the histogram and quantile-quantile plot for sample proportions of the simulated samples.
2. Display the intervals in a way that shows how they vary from sample to sample and include an indicator for which intervals contain p and which do not. This can be accomplished by plotting vertical bars representing the intervals across the x-axis. Different colors for these bars can be used to differentiate between intervals that contain p and those that do not.

The script to accomplish these goals should begin with specification of the simulation parameters so that they can be modified easily. This is followed by obtaining the simulated samples, constructing the confidence intervals, and then determining which intervals contain p.

### parameters ###
N = 200 #number of simulated samples
n = 400 #sample size
p = .36 #population proportion
alpha = .05 #for 95% confidence interval
conf.int.colors = c("cyan","red") #red-green is not a good choice in case a viewer is color-blind
### simulate samples ###
Xall = rbinom(N,n,p)/n # sample proportions
### construct intervals and store them in a matrix with 2 columns
zval = qnorm(1-alpha/2)
sig = sqrt(p*(1-p)/n)
sighat = sqrt(Xall*(1-Xall)/n)
conf.mat = matrix(0,N,2)
conf.mat[,1] = Xall - zval*sighat
conf.mat[,2] = Xall + zval*sighat
### indicate which do not contain p
ci.out = conf.mat[,1] > p | conf.mat[,2] < p
cnt.out = sum(ci.out) #number of intervals that missed p
conf.col = rep(conf.int.colors[1],N)
conf.col[ci.out] = conf.int.colors[2]
### show histogram and QQ-plot
hist(Xall,col="cyan",main="")
title(paste("Histogram of Sample Proportions\nfrom",N,"Random Samples"))
title(sub=paste("Sample size =",n))
### plot confidence intervals ###
### add each bar sequentially for in-class display
Y.lim = range(conf.mat)
plot(0,0,xlim=c(0,N),ylim=Y.lim,xlab="",ylab="",type="n")
title(paste("Simulation of",paste(round(100*(1-alpha)),"%",sep=""),
    "Confidence Intervals for", paste("Binom(",p,")",sep="")))
mtext(paste("Sample size =",n),line=.25)
for(k in seq(N)) {
    rect(k-1,conf.mat[k,1],k,conf.mat[k,2],col=conf.col[k])
}
mtext(paste(paste(round(100*cnt.out/N,1),"%",sep=""),
    "of these confidence intervals do not contain"%,p),side=1,line=3)
abline(h=p)
### repeat confidence interval plot using n=1000
### keep same y-limits
n = 1000 #sample size
### simulate samples ###
all = rbinom(N,n,p)/n # sample proportions
### construct intervals and store them in a matrix with 2 columns
zval = qnorm(1-alpha/2)
sig = sqrt(p*(1-p)/n)
sighat = sqrt(Xall*(1-Xall)/n)
conf.mat = matrix(0,N,2)
conf.mat[,1] = Xall - zval*sighat
conf.mat[,2] = Xall + zval*sighat
### indicate which do not contain p
ci.out = conf.mat[,1] > p | conf.mat[,2] < p
cnt.out = sum(ci.out) #number of intervals that missed p
conf.col = rep(conf.int.colors[1],N)
conf.col[ci.out] = conf.int.colors[2]
### plot confidence intervals ###
### add each bar sequentially for in-class display
plot(0,0,xlim=c(0,N),ylim=Y.lim,xlab="",ylab="",type="n")
title(paste("Simulation of",paste(round(100*(1-alpha)),"%",sep=""),
    "Confidence Intervals for",paste%("Binom(",p,")",sep="")))
mtext(paste("Sample size =",n),line=.25)
for(k in seq(N)) {
  rect(k-1,conf.mat[k,1],k,conf.mat[k,2],col=conf.col[k])
}
mtext(paste(paste(round(100*cnt.out/N,1),"%",sep=""),
    "of these confidence intervals do not contain"%,p),side=1,line=3)
abline(h=p)

Note that we have used the same code twice, so let's define a function that returns the matrix of confidence limits. Arguments for this function should be parameters of the simulation. Then define a function that plots the intervals. Include arguments for specifying whether or not we want to include histogram and qqnorm plots. These functions are defined in the script:
http://www.utdallas.edu/~ammann/stat6341scripts/simConfInt.r
Examples of its use are in
http://www.utdallas.edu/~ammann/stat6341scripts/confsim.r

Now suppose we wish to extend these functions to include confidence intervals for a mean.
http://www.utdallas.edu/~ammann/stat6341scripts/simConfInt1.r
Examples are in
http://www.utdallas.edu/~ammann/stat6341scripts/confsim1.r

Further extend to include confidence intervals for a s.d.
http://www.utdallas.edu/~ammann/stat6341scripts/simConfInt2.r
Examples are in
http://www.utdallas.edu/~ammann/stat6341scripts/confsim2.r
Additional examples that simulate heavy-tailed and asymmetric distributions are in
http://www.utdallas.edu/~ammann/stat6341scripts/confsim3.r


next up previous
Next: Practical Guidelines for Monte Up: Examples Previous: Examples
Larry Ammann
2017-10-17