STAT 3A03 Applied Regression With SAS

Assignment 4: Due at 5:00 pm on Friday, November 30, 2018.

Dropboxes for assignment submission are outside HH-105. Your assignment MUST be deposited in the dropbox that corresponds to your last name.

For questions that require SAS:

Please submit all plots and results that the questions ask for. Please provide a descriptive answer (including equations where appropriate) for each part of the question.

Please print out and submit your SAS code. Please label and comment your SAS code as I showed you in your lab exercise.

N.B. Late assignments will not be accepted.

Q. 1 Edgar Anderson collected data to quantify the morphologic variation of Iris flowers of two related species. Anderson would like to know whether the relationship between Sepal Width and Petal Length differs between the two species, Setosa and Versicolor. Use the following output:

a) Write down the regression model.

b) Write down the design matrix X.

c) Briefly describe the difference between Type I SS and Type III SS. Which type of SS is more appropriate in this case? Justify your answer.

d) Find the 95% confidence interval for the difference between the two slopes. Give an interpretation of this confidence interval.
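
For orientation on part b), the structure of a design matrix for a two-group model with separate intercepts and slopes can be sketched as follows. This is an illustration in Python, not SAS, and it rests on assumptions that the handout does not state: Petal Length is treated as the response, Sepal Width as the predictor, and Setosa as the reference level. The four measurements are made up for the example and are not Anderson's data; the function name `design_row` is hypothetical.

```python
# Illustrative sketch only: made-up measurements, assumed reference level.
def design_row(sepal_width, species, reference="Setosa"):
    """One row of X for y = b0 + b1*x + b2*d + b3*(x*d) + error,
    where d = 0 for the reference species and d = 1 otherwise."""
    d = 0.0 if species == reference else 1.0
    return [1.0, sepal_width, d, sepal_width * d]

# (sepal width, species) pairs; the values are invented for illustration.
observations = [
    (3.5, "Setosa"),
    (3.0, "Setosa"),
    (3.2, "Versicolor"),
    (2.8, "Versicolor"),
]

# Columns: intercept, sepal width, species indicator, interaction.
X = [design_row(width, species) for width, species in observations]
for row in X:
    print(row)
```

Under this coding the coefficient on the interaction column is the difference between the two slopes, which is the quantity part d) asks about. A different reference level or parameterization changes the columns but not the fitted model.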

Q. 2 A consumer group conducted an experiment to compare the effectiveness of three commercially available weight-loss diets, A, B and C. They are interested in determining:

(i) Are all diets similar in the amount they reduce weight?

(ii) If they are not similar, does the effect of the initial weight on weight loss vary from diet to diet?

Thirty volunteers were randomly assigned to a diet (10 to each diet) for a one-month period. Their weights (in pounds) were recorded at the beginning and end of the month (see weightloss.csv).

a) Fit an appropriate model to test whether the mean weight loss differs across the three diets and report your findings.

b) Fit the model that assumes the initial weight affects the weight loss in the same way for all three diets and report your findings.

c) Fit the model that assumes the initial weight affects the weight loss differently across the three diets and report your findings.

d) What model would you choose for this data? Justify your answer.
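
The three models in parts a) to c) are nested, so one common way to compare them is a partial F-test on the drop in error sum of squares. The sketch below illustrates that arithmetic in plain Python rather than SAS. Everything in it is an assumption made for illustration: the twelve data points are invented (they are not from weightloss.csv), diet A is taken as the reference level, and the helper names `ols_sse`, `row` and `partial_f` are hypothetical.

```python
def ols_sse(X, y):
    """Least squares via the normal equations; returns the error sum of squares."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

# Invented data: (diet, initial weight in pounds, weight loss in pounds).
data = [("A", 150, 5.0), ("A", 165, 6.1), ("A", 180, 7.2), ("A", 200, 8.4),
        ("B", 155, 3.9), ("B", 170, 4.2), ("B", 185, 4.8), ("B", 195, 5.1),
        ("C", 160, 6.5), ("C", 175, 8.2), ("C", 190, 9.6), ("C", 205, 11.3)]
y = [loss for _, _, loss in data]
n = len(data)
# Centering the covariate improves numerical stability and leaves the SSE unchanged.
mean_init = sum(w for _, w, _ in data) / n

def row(diet, init, model):
    """Design row: 'a' = diet only, 'b' = diet + common slope, 'c' = separate slopes."""
    d_b = 1.0 if diet == "B" else 0.0
    d_c = 1.0 if diet == "C" else 0.0
    x = init - mean_init
    r = [1.0, d_b, d_c]
    if model in ("b", "c"):
        r.append(x)
    if model == "c":
        r += [d_b * x, d_c * x]
    return r

n_params = {"a": 3, "b": 4, "c": 6}
sse = {m: ols_sse([row(d, w, m) for d, w, _ in data], y) for m in "abc"}

def partial_f(reduced, full):
    """F = [(SSE_reduced - SSE_full) / extra parameters] / [SSE_full / error df]."""
    extra = n_params[full] - n_params[reduced]
    return ((sse[reduced] - sse[full]) / extra) / (sse[full] / (n - n_params[full]))

print("F, common slope vs diet only:", partial_f("a", "b"))
print("F, separate slopes vs common slope:", partial_f("b", "c"))
```

Each statistic would be compared with an F distribution whose degrees of freedom are the number of extra parameters and the error degrees of freedom of the larger model. With the real data the same comparison is available from the SAS output of the fitted models.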

Q. 3 The data on crime-related statistics for 47 U.S. states in 1960 are given in the file “crimerate.csv”.

a) Use backward selection and forward selection to select independent variables and compare the final models, with both alpha-to-add and alpha-to-drop set to α = 0.1.

b) Use the information criteria AIC and BIC to perform all-subsets model selection and compare the final models.

c) Bonus: Perform 5-fold cross-validation for the models you chose in b). Which model is better in terms of the average CV SSE?
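
As a reminder of what the bonus part computes, the sketch below shows one way to obtain an average 5-fold cross-validated SSE for two candidate regression models. It is written in plain Python rather than SAS and uses synthetic data generated in the script, not crimerate.csv; the sample size of 47 only mirrors the number of states. The helper names `ols_fit`, `predict` and `cv_sse`, the shuffling seed, and the two candidate models are all assumptions made for illustration.

```python
import random

def ols_fit(X, y):
    """Least squares coefficients from the normal equations (Gauss-Jordan elimination)."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, row):
    return sum(b * x for b, x in zip(beta, row))

def cv_sse(X, y, k=5, seed=0):
    """Average over k folds of the SSE on each held-out fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all rows
    fold_sse = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        beta = ols_fit([X[i] for i in train], [y[i] for i in train])
        fold_sse.append(sum((y[i] - predict(beta, X[i])) ** 2 for i in held_out))
    return sum(fold_sse) / k

# Synthetic data: y depends on x1 only; x2 is pure noise.
rng = random.Random(1)
n = 47
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 + 1.5 * a + rng.gauss(0, 0.5) for a in x1]

model_small = [[1.0, a] for a in x1]               # intercept + x1
model_big = [[1.0, a, b] for a, b in zip(x1, x2)]  # intercept + x1 + x2

print("average CV SSE, small model:", cv_sse(model_small, y))
print("average CV SSE, big model:  ", cv_sse(model_big, y))
```

Using the same seed for both models keeps the fold assignment identical, so the two averages are compared on the same splits. The model with the smaller average CV SSE predicts held-out observations better under this criterion.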

