STA 138 Winter 2019
Homework 4 - Due Friday, Feb 15th
Book Portion (does not require R)
Note: This may be hand written or typed. Answers
should be clearly marked. Please put your name in
the upper right corner.
1. A logistic regression model was fit, where Y = 1 indicates
the subject defaulted on a loan, and Y = 0 indicates they did
not. The explanatory variable was X = loan balance (in
dollars). The estimated regression model is:
ln(π / (1 - π)) = -10.4522 + 0.005368X
Further, SE(β1) = 0.000306, and you may assume the
minimum value for the loan was 0, and the maximum was
3000.
(a) Interpret exp(-10.4522) in terms of the problem.
(b) Interpret exp(0.005368) in terms of the problem.
(c) Predict the probability that someone defaults on a loan
when their balance is 2000.
(d) Does the probability that a subject defaults on a loan
increase or decrease with loan balance? Explain your
answer.
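The way a fitted logit maps to odds and probabilities can be checked numerically. The following is only a sketch in R, assuming the intercept is -10.4522 (as the exp(-10.4522) in part (a) suggests); the variable names are made up for this illustration.

```r
# Estimated coefficients from problem 1 (negative intercept is an assumption)
b0 <- -10.4522
b1 <- 0.005368

# Linear predictor (log-odds of default) at a balance of 2000 dollars
eta <- b0 + b1 * 2000

# plogis() is the inverse logit: exp(eta) / (1 + exp(eta))
p_hat <- plogis(eta)

# Multiplicative change in the odds per one-dollar increase in balance
odds_ratio <- exp(b1)

c(log_odds = eta, probability = p_hat, odds_ratio = odds_ratio)
```

Because b1 is positive, exp(b1) exceeds 1, so the estimated odds (and hence the probability) rise with the balance.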
2. Continue with problem 1.
(a) Find the 95% Wald confidence interval for the value of
exp(β1).
(b) Interpret your interval in terms of the problem.
(c) What is the largest change in odds of defaulting we
would expect when the loan amount increases by 500?
(d) Would a confidence interval for exp(β0) be useful in
this case? Explain your answer.
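A Wald interval for exp(β1) is usually built on the log-odds scale and then exponentiated. A minimal sketch in R, using the estimate and standard error quoted in problem 1 (the object names are hypothetical):

```r
# Slope estimate and standard error from problem 1
b1    <- 0.005368
se_b1 <- 0.000306

# 95% Wald interval for beta1: estimate +/- z * SE
z       <- qnorm(0.975)
ci_beta <- b1 + c(-1, 1) * z * se_b1

# Exponentiate the endpoints to get an interval for exp(beta1)
ci_or <- exp(ci_beta)

# For a 500-dollar increase, scale the log-odds endpoints by 500 first
ci_or_500 <- exp(500 * ci_beta)

list(ci_beta = ci_beta, ci_or = ci_or, ci_or_500 = ci_or_500)
```

The upper endpoint of ci_or_500 is one reasonable reading of the "largest change in odds" asked about in part (c).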
3. A logistic regression model was fit, where Y = 0 indicates
the subject does not smoke, and Y = 1 indicates the
subject does smoke. The explanatory variable was X =
smoking status of parents (Yes if at least one parent
smoked, No if no parents smoked). The estimated regression
model is:
ln(π / (1 - π)) = -1.8266 + 0.4592 X_Yes
Further, SE(β1) = 0.08782.
(a) Interpret exp(-1.8266) in terms of the problem.
(b) Interpret exp(0.4592) in terms of the problem.
(c) Write down the two separate models suggested by the
categorical variable.
(d) Predict the probability of the subject smoking if their
parent did not smoke.
4. Continue with problem 3.
(a) Test H0 : β1 = 0, and list the test statistic, p-value,
and conclusion.
(b) Interpret the p-value in terms of the problem.
(c) Find the 99% confidence interval for exp(β1).
(d) Interpret the interval in terms of the problem.
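The Wald test of H0 : β1 = 0 divides the estimate by its standard error and compares the result to a standard normal distribution. A short sketch in R, assuming the standard error quoted in problem 3 belongs to β1 (object names are hypothetical):

```r
# Slope estimate and its standard error from problem 3
b1    <- 0.4592
se_b1 <- 0.08782

# Wald z statistic and two-sided p-value
z_stat  <- b1 / se_b1
p_value <- 2 * pnorm(-abs(z_stat))

# Critical value a 99% Wald interval would use in place of 1.96
z_99 <- qnorm(0.995)

c(z = z_stat, p = p_value, z_99 = z_99)
```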
5. Answer the following with TRUE or FALSE. It is good
practice to explain your answer.
(a) If the confidence interval for exp(β1) contains 1, it
suggests no influence of X1 on the odds of the trait
(Y = 1).
(b) exp(β1) gives the estimate for the odds of the trait
(Y = 1).
(c) β0 always has a practical interpretation.
(d) If we fail to reject H0 : β1 = 0, we conclude that X1
does not affect the probability of the trait.
R Portion (requires some use of R)
Note: You do not have to use R Markdown to turn
in the homework, but the homework must be turned
in in a reasonable format. The answers to the questions
should be in the body of the homework, and the
code used to obtain those answers should be in an appendix.
There should be no code in the body of the
homework. You can accomplish this in R, Word, LaTeX,
Google Docs, etc. This portion should be printed
out and turned in with the hand-written portion.
I. In the file college.csv on Canvas, you will find data
on education aspirations of high school students based
on family income. The columns are values of X (family
income), values of Y (aspirations), and scores for X, Y
labeled ui and vj respectively.
(a) Using R, find the p-value for Pearson's test statistic
for testing independence.
(b) Using R, find the p-value of the Mantel-Yates/Mantel-Haenszel
test statistic for testing independence.
(c) Based on the data, which do you think was more
appropriate? Explain.
(d) What would your conclusion be for the test-statistic
you deemed most appropriate?
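Both tests can be run in a few lines of R. The sketch below does not use college.csv: the counts, the category labels, and the equally spaced scores are all made up for illustration, and the trend statistic is computed as M² = (n - 1) r², which is one common form of the Mantel-Haenszel statistic for ordinal data.

```r
# Hypothetical 3 x 3 table of counts (NOT the college.csv data)
counts <- matrix(c(20, 15,  5,
                   15, 20, 15,
                    5, 15, 30),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(income     = c("low", "middle", "high"),
                                 aspiration = c("some HS", "HS grad", "college")))

# Pearson chi-square test of independence (ignores category order)
pearson <- chisq.test(counts, correct = FALSE)

# Assumed scores for rows (u) and columns (v)
u <- 1:3
v <- 1:3

# Expand the table to one pair of scores per subject
long <- as.data.frame(as.table(counts))
ui <- rep(u[as.integer(long$income)],     times = long$Freq)
vj <- rep(v[as.integer(long$aspiration)], times = long$Freq)

# Linear-trend statistic: M^2 = (n - 1) * r^2, chi-square with 1 df under H0
n       <- sum(counts)
r       <- cor(ui, vj)
M2      <- (n - 1) * r^2
p_trend <- pchisq(M2, df = 1, lower.tail = FALSE)

c(pearson_p = pearson$p.value, M2 = M2, trend_p = p_trend)
```

The trend test uses the ordering of the categories through the scores, while Pearson's test treats both variables as nominal.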
II. Online you will find a dataset flu.csv. The columns
we are interested in are shot (1 indicates flu shot, 0
indicates no flu shot), and age (the age of the subject).
(a) Fit the logistic regression model and write down the
estimated logistic-regression function.
(b) Using (a), estimate the probability that a 55 year
old will get the flu shot.
(c) Based on the sign of β1, as your age increases does
the probability of a flu shot go up or down?
(d) Interpret the value of exp(β̂1).
(e) Interpret the value of exp(β̂0), if appropriate.
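The general workflow for question II can be sketched in R as follows. Since flu.csv is not reproduced here, the sketch simulates a stand-in data set with the same two column names; the simulated coefficients and the object names are assumptions, and none of the numbers it produces relate to the real data.

```r
# Simulated stand-in for flu.csv (hypothetical data, for illustration only)
set.seed(138)
n    <- 200
age  <- round(runif(n, 20, 80))
shot <- rbinom(n, size = 1, prob = plogis(-4 + 0.07 * age))
flu  <- data.frame(shot = shot, age = age)

# Logistic regression of shot on age
fit <- glm(shot ~ age, data = flu, family = binomial)

# Estimated coefficients on the log-odds scale, and their exponentials
coef(fit)
exp(coef(fit))

# Estimated probability of a flu shot for a 55-year-old
p55 <- predict(fit, newdata = data.frame(age = 55), type = "response")
p55
```

With the real file, the simulated data frame would be replaced by something like read.csv("flu.csv").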
III. Continue with question II.
(a) Find the Wald and LR test-statistic for testing if β1
= 0.
(b) Find the p-values for the test-statistics in (a).
(c) Interpret one of the p-values in (b) in terms of the
problem.
(d) State the conclusion of the hypothesis test in terms
of the problem.
(e) Find the 99% LR confidence interval for β1, and
interpret it in terms of the problem.
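One way to obtain the Wald and likelihood-ratio quantities for question III in R is sketched below. It again uses simulated stand-in data rather than flu.csv, and it assumes that the profile-likelihood interval returned by confint() is what the course means by an LR confidence interval.

```r
# Simulated stand-in for flu.csv (hypothetical data, for illustration only)
set.seed(138)
n    <- 200
age  <- round(runif(n, 20, 80))
shot <- rbinom(n, size = 1, prob = plogis(-4 + 0.07 * age))
flu  <- data.frame(shot = shot, age = age)

fit  <- glm(shot ~ age, data = flu, family = binomial)  # model with age
fit0 <- glm(shot ~ 1,   data = flu, family = binomial)  # intercept only

# Wald z statistic and p-value for the age coefficient
wald <- summary(fit)$coefficients["age", c("z value", "Pr(>|z|)")]

# Likelihood-ratio statistic: drop in deviance, chi-square with 1 df
lr_stat <- deviance(fit0) - deviance(fit)
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# 99% profile-likelihood interval for the age coefficient
ci_99 <- confint(fit, parm = "age", level = 0.99)

list(wald = wald, lr_stat = lr_stat, lr_p = lr_p, ci_99 = ci_99)
```

The same deviance difference is reported by anova(fit0, fit, test = "Chisq"), and exp(ci_99) gives the corresponding interval for exp(β1).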
IV. Online you will find a dataset flu2.csv. The columns
we are interested in are shot (1 indicates flu shot, 0
indicates no flu shot), and gender (M or F).
(a) Find the estimated logistic regression function.
(b) Using (a), write down the two separate models suggested
by the categorical variable.
(c) Interpret the value of exp(β1).
(d) Interpret the value of exp(β0).
(e) Test to see if Gender can be dropped from the model.
State the test-statistic, p-value, and conclusion.
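With a two-level categorical predictor, R codes one level as the reference and estimates a single indicator coefficient. The sketch below uses simulated stand-in data rather than flu2.csv; the group probabilities and object names are assumptions, and F becomes the reference level only because R orders factor levels alphabetically by default.

```r
# Simulated stand-in for flu2.csv (hypothetical data, for illustration only)
set.seed(138)
gender <- factor(sample(c("F", "M"), 200, replace = TRUE))
shot   <- rbinom(200, size = 1, prob = ifelse(gender == "M", 0.30, 0.45))
flu2   <- data.frame(shot = shot, gender = gender)

fit <- glm(shot ~ gender, data = flu2, family = binomial)
b   <- coef(fit)  # named "(Intercept)" and "genderM"

# The two separate models implied by the indicator variable
logit_F <- b["(Intercept)"]                 # gender = F (reference level)
logit_M <- b["(Intercept)"] + b["genderM"]  # gender = M

# Likelihood-ratio test of whether gender can be dropped
drop_test <- anova(fit, test = "Chisq")
drop_test
```

Here plogis(logit_F) and plogis(logit_M) reproduce the observed proportion of flu shots in each group, which is a useful sanity check on the coding.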
V. Online you will find a dataset heart.csv. The columns
we are interested in are CHD (1 indicates coronary heart
disease (CHD), 0 indicates no CHD), and age (the age
of the subject). These data come from Hosmer, D.W.,
Lemeshow, S. and Sturdivant, R.X. (2013) Applied Logistic
Regression: Third Edition.
(a) Fit the logistic regression model and write down the
estimated logistic-regression function.
(b) Using (a), estimate the probability that a 69 year
old will have CHD.
(c) Based on the sign of β1, as your age increases does
the probability of CHD go up or down?
(d) Interpret the value of exp(β̂1).
VI. Continue with question V.
(a) Find the Wald and LR test-statistic for testing if β1
= 0.
(b) Find the p-values for the test-statistics in (a).
(c) Interpret one of the p-values in (b) in terms of the
problem.
(d) State the conclusion of the hypothesis test in terms
of the problem.
(e) Find the 90% LR confidence interval for exp(β1),
and interpret it in terms of the problem.