Assignment 2
Due Wednesday March 6th at noon via Crowdmark.
Question 1. (25 points) Refer to file ‘Q1.csv’ on Canvas with data about some larvae.
Consider a linear model that predicts the metabolic rate using body size (measured in
grams) as an explanatory variable. The variable names in the file are descriptive of
what’s being measured.
Consider also similar models but with a log base 10 transformation of only metabolic
rate, only body size, or both variables.
From all the options considered above, which model best satisfies the conditions that can
be assessed with residual plots and with Normal probability plots (or with Normal
quantile plots)? Explain and provide all necessary plots and all your work in R.
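One way to organize the comparison is sketched below. The column names metabolic_rate and body_size and the simulated values are placeholders, not the contents of ‘Q1.csv’; with the real file, the data frame would come from read.csv("Q1.csv") and the actual names from names().

```r
# Simulated stand-in data (NOT the real Q1.csv); column names are assumed.
set.seed(1)
larvae <- data.frame(body_size = exp(runif(40, log(0.01), log(5))))
larvae$metabolic_rate <- 2 * larvae$body_size^0.8 * exp(rnorm(40, sd = 0.2))

# The four candidate models: no transformation, log10 of the response only,
# log10 of the explanatory variable only, and log10 of both.
models <- list(
  none     = lm(metabolic_rate ~ body_size, data = larvae),
  log_y    = lm(log10(metabolic_rate) ~ body_size, data = larvae),
  log_x    = lm(metabolic_rate ~ log10(body_size), data = larvae),
  log_both = lm(log10(metabolic_rate) ~ log10(body_size), data = larvae)
)

# For each model, draw one page with two panels:
# residuals vs. fitted values, and a Normal quantile plot of the residuals.
op <- par(mfrow = c(1, 2))
for (nm in names(models)) {
  fit <- models[[nm]]
  plot(fitted(fit), resid(fit), main = paste(nm, "residuals"),
       xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)
  qqnorm(resid(fit), main = paste(nm, "Normal Q-Q"))
  qqline(resid(fit))
}
par(op)
```

Keeping the fits in a named list means every model gets exactly the same pair of plots, which makes the side-by-side judgment about curvature, spread, and Normality easier to defend.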
Question 2. Refer to file ‘MthEnr.csv’. Use spring as the response variable and fall as
the explanatory variable.
(a) (12 points) Via a plot (or plots), identify the clearest candidate for an
influential observation in the data set. Confirm that this observation is influential by
obtaining and graphing (along with the original data) the regression equations with and
without it. Explain. Provide the equations for both cases.
If you want to remove the first row from a data frame called dat1 in R, you can use the
following instruction:
    dat2 = dat1[-c(1), ]
Object dat2 contains the same information as dat1 without the first row. Similar
instructions can be used to remove any other row.
Explain and provide all necessary plots and all your work in R. Also show your original
data and the data without the influential observation.
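The with-and-without comparison could be sketched as follows. The data here are simulated with one planted unusual point and are not the values in ‘MthEnr.csv’. The row index i is picked by Cook's distance only so the sketch has a row to drop; that is an extra numeric check not named in the question, which asks for the candidate to be identified from a plot.

```r
# Simulated stand-in data (NOT the real MthEnr.csv); all values are made up.
set.seed(2)
dat1 <- data.frame(fall = seq(250, 340, length.out = 10))
dat1$spring <- 100 + 0.6 * dat1$fall + rnorm(10, sd = 5)
dat1 <- rbind(dat1, data.frame(fall = 420, spring = 250))  # planted unusual point

fit_all <- lm(spring ~ fall, data = dat1)

# Assumed way of choosing the candidate row: largest Cook's distance.
i <- unname(which.max(cooks.distance(fit_all)))

# Drop that row, using the same idiom as the assignment, and refit.
dat2 <- dat1[-c(i), ]
fit_drop <- lm(spring ~ fall, data = dat2)

# Original data with both regression lines on the same graph.
plot(spring ~ fall, data = dat1, pch = 19)
points(dat1$fall[i], dat1$spring[i], col = "red", cex = 2)  # circle the candidate
abline(fit_all, lty = 1)
abline(fit_drop, lty = 2)
legend("topleft", lty = c(1, 2), legend = c("all rows", "candidate removed"))

# Intercept and slope of each fitted equation.
coef(fit_all)
coef(fit_drop)
```

A large shift in slope or intercept between the two fits is what marks the dropped observation as influential.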
(b) (8 points) Provide a plot with all the 11 observations and include on the same
graph the regression line using all 11 observations and the regression line with the
information for year 2011 removed. Also include both regression equations.
Can observation 11 be considered an influential observation? Explain and provide all
necessary plots and all your work in R.
Question 3. Refer to dataset ‘Rails.csv’ on Canvas. Consider a model with adj2007
(estimated 2007 price in thousands of 2014 dollars) as a response variable and distance
(distance to the closest bike trail in km) and squarefeet (square footage of interior
finished space in thousands of square feet) as explanatory variables.
(a) (10 points) Assess the significance of this model using R. Show and label all steps
and show your work. Don’t forget to verify the conditions in the proper step.
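The computational part of the significance assessment might look like the sketch below. The data frame rails is simulated and only borrows the column names from the question; it is not ‘Rails.csv’. The hypotheses, the checking of conditions, and the conclusion in context still have to be written out as separate labeled steps.

```r
# Simulated stand-in data (NOT the real Rails.csv); all values are made up.
set.seed(3)
rails <- data.frame(distance = runif(60, 0, 4),
                    squarefeet = runif(60, 0.8, 3.5))
rails$adj2007 <- 80 - 15 * rails$distance + 140 * rails$squarefeet +
  rnorm(60, sd = 30)

fit <- lm(adj2007 ~ distance + squarefeet, data = rails)

# Overall F test. H0: both slopes are 0. Ha: at least one slope is not 0.
fs <- summary(fit)$fstatistic   # named vector: value, numdf, dendf
p_value <- unname(pf(fs["value"], fs["numdf"], fs["dendf"], lower.tail = FALSE))
fs
p_value

# Condition checks: residuals vs. fitted values and a Normal quantile plot.
op <- par(mfrow = c(1, 2))
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(resid(fit))
qqline(resid(fit))
par(op)
```

The same F statistic and p-value appear on the last line of the summary(fit) printout; extracting them explicitly just makes the test step easy to label.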
(b) (2 points) Suppose that the model is significant. What is the expected estimated
2007 price in thousands of 2014 dollars when the distance to the closest trail is half a
kilometer and the interior finished space is 2 thousand square feet? Show work in R.
(c) (5 points) Suppose the model is significant and that the required conditions for
regression are satisfied. Obtain a 90% confidence interval for the mean response when
distance = 0.5 and squarefeet = 2. Also calculate the standard error of the point
estimate. No need to show the four-step process for the interval but do show all your
work in R.
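A sketch of the interval for the mean response, again on simulated data that merely reuses the column names from the question (not ‘Rails.csv’). The fit column of the result is also the point estimate, here evaluated at distance = 0.5 and squarefeet = 2.

```r
# Simulated stand-in data (NOT the real Rails.csv); all values are made up.
set.seed(3)
rails <- data.frame(distance = runif(60, 0, 4),
                    squarefeet = runif(60, 0.8, 3.5))
rails$adj2007 <- 80 - 15 * rails$distance + 140 * rails$squarefeet +
  rnorm(60, sd = 30)
fit <- lm(adj2007 ~ distance + squarefeet, data = rails)

new_house <- data.frame(distance = 0.5, squarefeet = 2)

# 90% confidence interval for the mean response, plus its standard error.
ci <- predict(fit, newdata = new_house, interval = "confidence",
              level = 0.90, se.fit = TRUE)
ci$fit      # columns: fit (point estimate), lwr, upr
ci$se.fit   # standard error of the estimated mean response

# The same interval by hand: estimate +/- t* x SE, with t* on the residual df.
t_star <- qt(0.95, df = ci$df)
by_hand <- ci$fit[1, "fit"] + c(-1, 1) * t_star * ci$se.fit
by_hand
```

Reproducing the interval by hand is a quick check that the reported standard error is the one the interval actually uses.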
(d) (5 points) Suppose the model is significant and that the required conditions for
regression are satisfied. Obtain a 99% prediction interval for a new response value when
distance = 1 and squarefeet = 1. Also calculate the standard error of the point
estimate. No need to show the four-step process for the interval but do show all your
work in R.
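The prediction interval can be sketched the same way, on simulated data that is not ‘Rails.csv’. One assumption to flag: the sketch reports both the standard error of the fitted mean (se.fit) and the larger standard error for predicting a single new response, since the wording of the question could refer to either.

```r
# Simulated stand-in data (NOT the real Rails.csv); all values are made up.
set.seed(3)
rails <- data.frame(distance = runif(60, 0, 4),
                    squarefeet = runif(60, 0.8, 3.5))
rails$adj2007 <- 80 - 15 * rails$distance + 140 * rails$squarefeet +
  rnorm(60, sd = 30)
fit <- lm(adj2007 ~ distance + squarefeet, data = rails)

new_house <- data.frame(distance = 1, squarefeet = 1)

# 99% prediction interval for a single new response.
pi99 <- predict(fit, newdata = new_house, interval = "prediction",
                level = 0.99, se.fit = TRUE)
pi99$fit   # columns: fit (point estimate), lwr, upr

# Standard error for predicting one new response: combines the uncertainty
# in the fitted mean (se.fit) with the residual standard deviation.
se_pred <- sqrt(pi99$se.fit^2 + pi99$residual.scale^2)
se_pred

# The same interval by hand: estimate +/- t* x SE, with t* on the residual df.
t_star <- qt(0.995, df = pi99$df)
by_hand <- pi99$fit[1, "fit"] + c(-1, 1) * t_star * se_pred
by_hand
```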