STAT 3A03 Applied Regression With SAS
Assignment 4: Due at 5:00 pm on Friday, November 30, 2018.
Dropboxes for assignment submission are outside HH-105. Your assignment MUST be deposited
in the appropriate dropbox by your last name.
For questions that require SAS:
Please submit all plots and results that are required in the questions. Please provide a descriptive
answer (including equations as appropriate) for each part of the question.
Please print out and submit your SAS code. Please label and comment your SAS code as I
showed you in your lab exercise.
N.B. Late assignments will not be accepted.
Q. 1 Edgar Anderson collected data to quantify the morphologic variation of Iris flowers of two
related species. Anderson would like to know whether the relationship between Sepal Width and
Petal Length differs between two species of Iris flowers, Setosa and Versicolor. Use the
following output:
a) Write down the regression model.
b) Write down the design matrix X.
c) Briefly describe the difference between Type I SS and Type III SS. Which type of SS is more
appropriate in this case? Justify your answer.
d) Find the 95% confidence interval for the difference between the two slopes. Give an
interpretation of this confidence interval.
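As an illustration of parts a) and b), one common way to let the slope differ by species is the model y = b0 + b1*x + b2*d + b3*x*d + e, where d is an indicator for the second species and b3 is the difference between the two slopes. The Python sketch below builds the corresponding design matrix rows. It assumes Petal Length is the response, Sepal Width is the predictor x, and Setosa is the reference level; the records are made-up toy values, not Anderson's data, and the assignment itself still requires SAS.

```python
# Hypothetical toy records (sepal_width, species); values are illustrative only.
records = [
    (3.5, "setosa"),
    (3.0, "setosa"),
    (3.2, "versicolor"),
    (2.8, "versicolor"),
]


def design_row(sepal_width, species, reference="setosa"):
    """Return the row [1, x, d, x*d], where d = 1 for the non-reference species."""
    d = 0.0 if species == reference else 1.0
    return [1.0, sepal_width, d, sepal_width * d]


# One row per observation: intercept, x, species indicator, interaction.
X = [design_row(width, species) for width, species in records]
for row in X:
    print(row)
```

Under this coding, rows for the reference species have zeros in the last two columns, so b2 and b3 only enter the fitted line for the other species.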
Q. 2 A consumer group conducted an experiment to compare the effectiveness of three commercially
available weight-loss diets, A, B and C. They are interested in determining:
(i) Are all diets similar in the amount by which they reduce weight?
(ii) If they are not similar, does the effect of the initial weight on weight loss vary from diet
to diet?
Thirty volunteers were randomly assigned to a diet (10 to each diet) for a one month period.
Their weights (in pounds) were recorded at the beginning and end of the month (see
weightloss.csv).
a) Fit an appropriate model to test if there is a difference between the mean weight loss
across the three diets and report your findings.
b) Fit the model that assumes the initial weight affects the weight loss in the same way for
all three diets and report your findings.
c) Fit the model that assumes the initial weight affects the weight loss differently across
the three diets and report your findings.
d) What model would you choose for this data? Justify your answer.
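The models in b) and c) are nested, so one way to compare them is a partial F test on their error sums of squares. The Python sketch below shows only the arithmetic. The SSE values are hypothetical placeholders, not results from weightloss.csv, and the parameter counts assume an intercept, two diet indicators and one initial-weight slope in the common-slope model, plus two interaction terms in the separate-slopes model.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """F statistic for a reduced model nested inside a full model.

    df_reduced and df_full are residual degrees of freedom
    (number of observations minus number of fitted coefficients).
    """
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


n = 30  # thirty volunteers, as stated in the question

# Hypothetical SSE values, chosen only to demonstrate the calculation.
f_stat = partial_f(
    sse_reduced=130.0, df_reduced=n - 4,  # common slope: 4 coefficients
    sse_full=100.0, df_full=n - 6,        # separate slopes: 6 coefficients
)
print(round(f_stat, 3))
```

The statistic would be compared with an F distribution on (df_reduced - df_full, df_full) degrees of freedom; a large value favours the model with separate slopes.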
Q. 3 The data on crime-related statistics for 47 U.S. states in 1960 are given in the file “crimerate.csv”.
a) Use backward selection and forward selection to select independent variables and compare
the final models. Set both the alpha to add and the alpha to drop to α = 0.1.
b) Use the information criteria AIC and BIC to perform all-subsets model selection and compare
the final models.
c) Bonus: Perform 5-fold cross-validation for the models you chose in b). Which model
is the better model in terms of the averaged CV SSE?
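The mechanics behind parts b) and c) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the SAS procedure the assignment expects: the data are made-up toy values rather than crimerate.csv, the predictor names x1 and x2 are hypothetical, and the criteria use one common convention, AIC = n*ln(SSE/n) + 2k and BIC = n*ln(SSE/n) + k*ln(n), with k the number of regression coefficients including the intercept.

```python
import math
import random
from itertools import combinations


def ols_fit(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan elimination)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def sse(beta, X, y):
    """Sum of squared prediction errors for coefficients beta."""
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))


def design(columns, names, rows):
    """Intercept column plus the named predictors, restricted to the given row indices."""
    return [[1.0] + [columns[name][i] for name in names] for i in rows]


def all_subsets(columns, y):
    """Return (names, AIC, BIC) for every subset of the predictors."""
    n = len(y)
    out = []
    for size in range(len(columns) + 1):
        for names in combinations(columns, size):
            X = design(columns, names, range(n))
            s = sse(ols_fit(X, y), X, y)
            k = size + 1  # coefficients including the intercept
            out.append((names, n * math.log(s / n) + 2 * k,
                        n * math.log(s / n) + k * math.log(n)))
    return out


def cv_sse(columns, names, y, folds=5, seed=0):
    """Average held-out SSE over the folds of a shuffled k-fold split."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    total = 0.0
    for f in range(folds):
        test = idx[f::folds]
        held = set(test)
        train = [i for i in idx if i not in held]
        beta = ols_fit(design(columns, names, train), [y[i] for i in train])
        total += sse(beta, design(columns, names, test), [y[i] for i in test])
    return total / folds


# Hypothetical toy data: y depends on x1; x2 is an unrelated predictor.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "x2": [1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0],
}
y = [2.1, 3.8, 6.15, 7.95, 10.2, 11.9, 13.85, 16.05, 18.1, 19.9]

results = all_subsets(columns, y)
best_aic = min(results, key=lambda r: r[1])
best_bic = min(results, key=lambda r: r[2])
print("best by AIC:", best_aic[0])
print("best by BIC:", best_bic[0])
for names in {best_aic[0], best_bic[0]}:
    print(names, "average 5-fold CV SSE:", cv_sse(columns, names, y))
```

Because BIC penalises each extra coefficient by ln(n) rather than 2, it tends to pick the same or a smaller model than AIC once n exceeds about 7. The cross-validated SSE then gives an out-of-sample check on whichever models the two criteria select.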