Poisson and Logistic Regression using R

Read BYSH chapter 4


4.9 Exercises

4.9.1 Conceptual Exercises

(1) What are features of inferential OLS models that make them less suitable for count data?

(5) Why is the log of means, log(Ȳ), not Ȳ, plotted against X when assessing the assumptions for Poisson regression? How can this assumption be checked if there are not many repeated observations at each level of X?

(6) Is it possible that a predictor is significant for a model fit using a likelihood, but not for a model for the same data fit using a quasilikelihood? Explain.
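As a minimal sketch relating to question (6), the R code below fits the same counts with a Poisson likelihood and with a quasi-Poisson quasilikelihood. The data are simulated purely for illustration, and the names x, y, fit_pois and fit_quasi are hypothetical. It shows that the point estimates coincide while the quasi-Poisson standard errors are scaled by the square root of the estimated dispersion.

```r
# Hypothetical overdispersed counts, simulated for illustration only.
set.seed(1)
n <- 200
x <- runif(n)
mu <- exp(0.5 + 0.3 * x)
y <- rnbinom(n, size = 1, mu = mu)  # variance is mu + mu^2, larger than the mean

fit_pois  <- glm(y ~ x, family = poisson)
fit_quasi <- glm(y ~ x, family = quasipoisson)

# Point estimates agree; only the standard errors differ.
coef(fit_pois)
coef(fit_quasi)

phi      <- summary(fit_quasi)$dispersion  # estimated dispersion parameter
se_pois  <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]

se_quasi / se_pois  # each ratio equals sqrt(phi)
sqrt(phi)
```

Because the test statistic for a coefficient divides the estimate by its standard error, a larger standard error under the quasilikelihood fit can change the conclusion about significance.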

4.9.3 Poisson regression

1. Credit card use. A survey of 1,000 consumers asked respondents how many credit cards they use. Interest centers on the relationship between credit card use and income (in units of $10,000). A Poisson regression was fit, and the estimated coefficient for income is 0.744.

(a) Identify the predictor and interpret the coefficient for the predictor in this context.

(b) Describe how the assumption of linearity can be assessed in this example.
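One possible way to approach this exercise in R is sketched below. The survey data are not available here, so the income and cards vectors are simulated and hypothetical. The sketch assumes income is recorded in units of 10,000 as the exercise states. The first lines convert the reported coefficient to a multiplicative effect. The rest bins income and plots the log of the mean count per bin, which should look roughly linear if the linearity assumption holds.

```r
# Multiplicative effect implied by the reported coefficient.
beta_income <- 0.744
rate_ratio <- exp(beta_income)  # about 2.10 per one-unit increase in income
rate_ratio

# Hypothetical data (simulated; the survey data are not reproduced here).
set.seed(2)
income <- runif(1000, min = 1, max = 4)  # assumed to be in units of 10,000
cards  <- rpois(1000, lambda = exp(-1 + beta_income * income))

# Bin income into deciles, then compute the log of the mean count per bin.
bins <- cut(income,
            breaks = quantile(income, probs = seq(0, 1, by = 0.1)),
            include.lowest = TRUE)
mean_income    <- tapply(income, bins, mean)
log_mean_cards <- log(tapply(cards, bins, mean))

# A roughly straight pattern supports linearity on the log scale.
plot(mean_income, log_mean_cards,
     xlab = "Mean income in bin",
     ylab = "log(mean number of cards)")
```

Binning is one choice among several. With repeated observations at each income level, the log of the mean at each level could be plotted directly.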

2. Tornado damage. The number of deaths for 100 tornadoes was recorded along with the storm rating (F0, F1, F2, F4, or F5) and region of the country (Northeast, South, Midwest, West, or Southwest).

(a) Describe how the assumption of mean = variance could be assessed here.


(b) How would the significance of the addition of region to a model with storm rating be determined? Be specific.
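The two parts of the tornado exercise could be approached in R roughly as follows. The tornado data are not provided, so the deaths, rating and region variables are simulated and hypothetical. This is only a sketch of the mechanics: group means and variances for part (a), and a drop-in-deviance test comparing nested Poisson models for part (b).

```r
# Hypothetical data, simulated only to make the sketch runnable.
set.seed(3)
n <- 100
rating <- factor(sample(c("F0", "F1", "F2", "F4", "F5"), n, replace = TRUE))
region <- factor(sample(c("Northeast", "South", "Midwest", "West", "Southwest"),
                        n, replace = TRUE))
deaths <- rpois(n, lambda = exp(0.2 + 0.3 * as.integer(rating)))

# (a) Compare the sample mean and variance of deaths within each rating.
group_summary <- sapply(split(deaths, rating),
                        function(d) c(mean = mean(d), var = var(d)))
group_summary

# (b) Drop-in-deviance (likelihood-ratio) test for adding region.
fit_rating <- glm(deaths ~ rating, family = poisson)
fit_both   <- glm(deaths ~ rating + region, family = poisson)
lrt <- anova(fit_rating, fit_both, test = "Chisq")
lrt

# The same p-value computed by hand from the two residual deviances.
dev_drop <- deviance(fit_rating) - deviance(fit_both)
df_drop  <- df.residual(fit_rating) - df.residual(fit_both)
p_value  <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)
```

With five regions, adding region uses four degrees of freedom, so the drop in deviance is compared with a chi-square distribution on 4 df.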

Logistic Regression

The following questions are taken from An Introduction to Categorical Data Analysis (Agresti, 2nd edition).


1)  Each female horseshoe crab in the study had a male crab attached to her in her nest. The study investigated factors that affect whether the female crab had any other males, called satellites, residing near her. The response outcome for each female crab is “satellite”, where 1 = yes and 0 = no. Explanatory variables thought possibly to affect this are the female crab’s weight and shell width, which are summaries of her size. For the horseshoe crab data, fit a model using weight and width as predictors and “satellite” as the response.

a. Report the prediction equation.

b. Conduct a likelihood-ratio test of H0: β1 = β2 = 0. Interpret.

c. Conduct separate likelihood-ratio tests for the partial effects of each variable. Why does neither test show evidence of an effect when the test in (b) shows very strong evidence?
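A rough sketch of how parts (a) to (c) might be carried out in R follows. The crab data set is not included here, so weight, width and satellite are simulated, hypothetical stand-ins with deliberately correlated predictors. The numbers it produces say nothing about the real data.

```r
# Hypothetical stand-in for the crab data (simulated, not the real data set).
set.seed(4)
n <- 200
width     <- rnorm(n, mean = 26, sd = 2)
weight    <- 2.4 + 0.25 * (width - 26) + rnorm(n, sd = 0.25)
satellite <- rbinom(n, size = 1, prob = plogis(-12 + 0.5 * width))

fit_null   <- glm(satellite ~ 1, family = binomial)
fit_weight <- glm(satellite ~ weight, family = binomial)
fit_width  <- glm(satellite ~ width, family = binomial)
fit_full   <- glm(satellite ~ weight + width, family = binomial)

# (a) Coefficients of the prediction equation on the logit scale.
coef(fit_full)

# (b) Joint likelihood-ratio test of H0: beta1 = beta2 = 0 (2 df).
lrt_joint <- anova(fit_null, fit_full, test = "Chisq")

# (c) Partial likelihood-ratio tests, each on 1 df.
lrt_weight <- anova(fit_width, fit_full, test = "Chisq")   # weight given width
lrt_width  <- anova(fit_weight, fit_full, test = "Chisq")  # width given weight

# Correlation between the predictors, relevant to interpreting (c).
r_pred <- cor(weight, width)

lrt_joint
lrt_weight
lrt_width
r_pred
```

When two predictors are highly correlated, each adds little once the other is in the model, which is one reason the partial tests can be weak even when the joint test is strong.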


2)  For the horseshoe crab data, fit the logistic regression model with x = weight as the sole predictor of the presence of satellites.


a. For a classification table using the sample proportion of 0.642 as the cutoff, report the sensitivity and specificity. Interpret.

b. Form a ROC curve, and report and interpret the area under it.

c. Investigate the model goodness-of-fit using the Hosmer–Lemeshow statistic or some other model-checking approach. Interpret.

d. Inferentially compare the model to the model with x and x² as predictors. Interpret.

e. Compare the models in (d) using the AIC. Interpret.
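Parts (a), (b), (d) and (e) can be sketched in base R as shown below. The data are simulated and hypothetical, so the cutoff is taken as the sample proportion of the simulated responses rather than the 0.642 quoted in the exercise. The area under the ROC curve is computed from ranks (the Mann-Whitney identity) instead of an add-on package. Part (c) is not covered by this sketch.

```r
# Hypothetical data, simulated for illustration; not the crab data.
set.seed(5)
n <- 200
weight <- rnorm(n, mean = 2.4, sd = 0.6)
y <- rbinom(n, size = 1, prob = plogis(-3.5 + 1.8 * weight))

fit1  <- glm(y ~ weight, family = binomial)
p_hat <- fitted(fit1)

# (a) Classify using the sample proportion of 1s as the cutoff.
cutoff <- mean(y)
pred <- as.integer(p_hat > cutoff)
sensitivity <- sum(pred == 1 & y == 1) / sum(y == 1)
specificity <- sum(pred == 0 & y == 0) / sum(y == 0)

# (b) Area under the ROC curve via the rank (Mann-Whitney) identity.
n1 <- sum(y == 1)
n0 <- sum(y == 0)
auc <- (sum(rank(p_hat)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# (d) Likelihood-ratio test for adding a quadratic term in weight.
fit2 <- glm(y ~ weight + I(weight^2), family = binomial)
lrt  <- anova(fit1, fit2, test = "Chisq")

# (e) AIC comparison; the smaller value is preferred.
aic <- AIC(fit1, fit2)

c(sensitivity = sensitivity, specificity = specificity, auc = auc)
lrt
aic
```

The rank-based AUC equals the proportion of (satellite, no-satellite) pairs in which the crab with satellites has the higher fitted probability, counting ties as one half.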