**Due date**: Oct. 3, 2017.

- A researcher has developed a performance score for high schools that she believes gives a
reasonable indication of how well a district prepares its students for college. She examines 100
high schools from a variety of school districts and obtains her performance score for each of them.
She also obtains the average college freshman year GPA for random samples of students from each of
these high schools. The results are given below. Interpret this information. Now suppose that a
particular high school has a performance score of 220 and the average GPA of college freshmen from
this high school is 2.80. What can you say about the relative standing of this high school? What
conclusions can you draw about the actual GPA of students from this high school compared to the GPA
predicted by its performance score? Describe what possible problems might exist if performance
scores are used to predict GPA based on this data.
*mean performance = 156, s.d. = 30*; *mean GPA = 2.65, s.d. = 0.4*; *r = 0.72*

- Use the data contained in the file

http://www.utdallas.edu/~ammann/stat3355scripts/TrackRecords.csv

This data represents the national record times for males in track races. The first column gives the country names, so it should be used for row names.

- a) Report the means and standard deviations for each race.
- b) Which countries are more than 2 s.d.'s above the mean for the Marathon? Which are more than 2 s.d.'s below the mean for the 100 meter race?
- c) Construct a graphics page that contains two plots stacked vertically (see graphical parameter *mfrow*). These plots should be informative and visually appealing. The first should show how the Marathon record times are related to the 100 meter record times, and the second should show how the Marathon record times are related to the 1500 meter record times. Superimpose on each plot the least squares regression line to predict Marathon record times based on the corresponding race record times, and report r-squared for these variables on the respective plots.
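
One possible way to organize parts a) through c) is sketched below. It is only a starting point, not a full solution. The helper names `race_summary`, `extreme_rows`, and `plot_with_fit` are made up for illustration, and the column names `Marathon`, `m100`, and `m1500` in the usage comments are assumptions; check `names(Track)` after reading the file and adjust them.

```r
# Sketch only: helper names are hypothetical, and the column names used in
# the usage comments at the bottom are assumptions about the CSV file.

# Means and standard deviations of every column of a numeric data frame.
race_summary <- function(d) {
  data.frame(mean = sapply(d, mean), sd = sapply(d, sd))
}

# Row names whose value in column v lies more than k s.d.'s above
# (or below) the mean of that column.
extreme_rows <- function(d, v, k = 2, direction = c("above", "below")) {
  direction <- match.arg(direction)
  z <- (d[[v]] - mean(d[[v]])) / sd(d[[v]])
  keep <- if (direction == "above") z > k else z < -k
  rownames(d)[keep]
}

# Scatterplot of y vs x with the least squares line superimposed and
# r-squared reported in the title. Returns r-squared invisibly.
plot_with_fit <- function(d, x, y) {
  fit <- lm(d[[y]] ~ d[[x]])
  r2 <- cor(d[[x]], d[[y]])^2
  plot(d[[x]], d[[y]], xlab = x, ylab = y, pch = 19,
       main = paste0(y, " vs ", x, " (r-squared = ", round(r2, 3), ")"))
  abline(fit, col = "red")
  invisible(r2)
}

# Usage (not run here; column names are assumed):
# Track <- read.csv("http://www.utdallas.edu/~ammann/stat3355scripts/TrackRecords.csv",
#                   row.names = 1)
# race_summary(Track)
# extreme_rows(Track, "Marathon", direction = "above")
# extreme_rows(Track, "m100", direction = "below")
# par(mfrow = c(2, 1))
# plot_with_fit(Track, "m100", "Marathon")
# plot_with_fit(Track, "m1500", "Marathon")
```

For a simple linear regression, r-squared equals the squared correlation between the two variables, which is why `cor(...)^2` is used here instead of extracting it from `summary(fit)`.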

- Use the data contained in the file

http://www.utdallas.edu/~ammann/stat3355scripts/Sleep.data

A description of this data is given in

http://www.utdallas.edu/~ammann/stat3355scripts/Sleep.txt

The *Species* column should be used as row names.

- a) Construct histograms of each variable.
- b) The strong asymmetry for all variables except *Sleep* indicates that a *log* transformation is appropriate for those variables. Construct a new data frame that contains *Sleep*, replaces *BodyWgt, BrainWgt, LifeSpan* by their log-transformed values, and then construct histograms of each variable in this new data frame with all of them on the same graphics page.
- c) Plot *LifeSpan* vs *BodyWgt* with *LifeSpan* on the y-axis and include an informative title. Repeat using the log-transformed variables instead. Superimpose lines corresponding to the respective means of the variables for each plot.
- d) What proportion of species are within 2 s.d.'s of mean *LifeSpan*? What proportion are within 2 s.d.'s of mean *BodyWgt*? Answer these for the original variables and for the log-transformed variables.
- e) Obtain and interpret the correlation between *LifeSpan* and *BodyWgt*. Repeat for *log(LifeSpan)* and *log(BodyWgt)*.
- f) Obtain the least squares regression line to predict *LifeSpan* based on *BodyWgt*. Check assumptions with appropriate residual plots. Repeat to predict *log(LifeSpan)* based on *log(BodyWgt)*. Predict *LifeSpan* of *Homo sapiens* based on each of these regression lines. Which would you expect to have the best overall accuracy? Which prediction is closest to the actual *LifeSpan* of *Homo sapiens*?
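
The steps in parts b) through f) could be organized along the following lines. This is a rough sketch under stated assumptions, not a full solution. The helper names `log_columns`, `prop_within`, and `plot_with_means` are made up for illustration. The `read.table()` arguments, the presence of exactly four variables, and the row name `"Homo sapiens"` in the usage comments are assumptions about the file; check them against Sleep.txt. Missing values are dropped in case the data contain any.

```r
# Sketch only: helper names are hypothetical; variable names follow the
# assignment text, and the file layout in the usage comments is assumed.

# Copy of d in which the columns named in vars are replaced by their logs.
log_columns <- function(d, vars) {
  dl <- d
  dl[vars] <- lapply(d[vars], log)
  dl
}

# Proportion of non-missing values within k s.d.'s of the mean.
prop_within <- function(x, k = 2) {
  x <- x[!is.na(x)]
  mean(abs(x - mean(x)) <= k * sd(x))
}

# Scatterplot of y vs x with dashed lines drawn at the two means.
plot_with_means <- function(d, x, y, main) {
  plot(d[[x]], d[[y]], xlab = x, ylab = y, main = main, pch = 19)
  abline(v = mean(d[[x]], na.rm = TRUE),
         h = mean(d[[y]], na.rm = TRUE), lty = 2)
}

# Usage (not run here; file layout and row name are assumptions):
# Sleep  <- read.table("http://www.utdallas.edu/~ammann/stat3355scripts/Sleep.data",
#                      header = TRUE, row.names = "Species")
# Sleepl <- log_columns(Sleep, c("BodyWgt", "BrainWgt", "LifeSpan"))
# par(mfrow = c(2, 2))
# for (v in names(Sleepl)) hist(Sleepl[[v]], main = v, xlab = v)
# plot_with_means(Sleep, "BodyWgt", "LifeSpan", "Life span vs body weight")
# prop_within(Sleep$LifeSpan); prop_within(Sleepl$LifeSpan)
# cor(Sleep$LifeSpan, Sleep$BodyWgt, use = "complete.obs")
# fit  <- lm(LifeSpan ~ BodyWgt, data = Sleep)
# fitl <- lm(LifeSpan ~ BodyWgt, data = Sleepl)   # both variables are logs here
# plot(fitted(fit), resid(fit)); abline(h = 0, lty = 2)
# predict(fit, Sleep["Homo sapiens", ])
# exp(predict(fitl, Sleepl["Homo sapiens", ]))    # back to the original scale
```

The prediction from the log-log fit is on the log scale, so it is passed through `exp()` before being compared with the prediction from the untransformed fit.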

**Note**: if **X** is the name of a data frame in **R** that contains two variables, say
**V1** and **V2**, and you would like to create a new data frame with log-transformed values
of the variables in **X**, then you can create a new object, named for example **X1**, that is
assigned the value **X** and then log-transform the variables in this new data frame.

    # copy X, rename the columns, then overwrite them with the logs
    X1 = X
    names(X1) = paste("logV",1:2,sep="")
    X1$logV1 = log(X$V1)
    X1$logV2 = log(X$V2)

2017-11-16