STATS 763 - 2022 - Midterm test

6 April 2022, 10:00-11:30 NZST

Question 1 [21 marks total]

Wilms’ tumour is a rare childhood cancer of the kidney. Treatment is successful for the majority of patients, but a minority do relapse. An important risk factor for relapse is disease stage (how far it has spread).

The following data on all U.S. paediatric Wilms’ tumour patients between 1980 and 1994, inclusively, were collected:

•  Year: Year of diagnosis, from 1980 to 1994

•  Stage: Disease stage (I [least advanced], II, III, IV [most advanced])

•  rel5: Relapse within 5 years (0 [No], 1 [Yes]), hereafter called “relapse”.

We fit a relative risk model of rel5 on Year*Stage, to capture any secular trend in relapses by disease stage, and obtain the following results:

Call:
glm(formula = rel5 ~ Year * Stage, family = binomial(link = "log"), data = wilms)

Coefficients:
                Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)     52.59226    38.93026    1.351    0.1767
Year            -0.02767     0.01960   -1.412    0.1581
StageII       -107.15715    52.12848   -2.056    0.0398 *
StageIII        40.62429    50.38844    0.806    0.4201
StageIV        -29.17571    53.37873   -0.547    0.5847
Year:StageII     0.05422     0.02623    2.067    0.0387 *
Year:StageIII   -0.02008     0.02537   -0.792    0.4286
Year:StageIV     0.01525     0.02687    0.568    0.5703
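Written out, the fitted model has the form sketched below. This assumes R's default treatment contrasts with Stage I as the reference level, which is consistent with the coefficient names in the output. The symbols $\beta_0$, $\beta_Y$, $\beta_s$ and $\beta_{Y:s}$ are labels introduced here for the (Intercept), Year, Stage and Year:Stage coefficients; they are not notation from the test itself.

```latex
\log \Pr(\mathrm{rel5} = 1 \mid \mathrm{Year}, \mathrm{Stage})
  = \beta_0 + \beta_Y \,\mathrm{Year}
  + \sum_{s \in \{\mathrm{II},\,\mathrm{III},\,\mathrm{IV}\}}
      \left( \beta_s + \beta_{Y:s}\,\mathrm{Year} \right)
      \mathbf{1}\{\mathrm{Stage} = s\}
```

Because the link is the logarithm, differences in this linear predictor are log relative risks.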

Selected rows and columns from the estimated variance matrix of the coefficient estimates are given below:

               (Intercept)       Year    StageIV  Year:StageIV
(Intercept)         1515.6     -0.763    -1515.6         0.763
Year                -0.763   0.000384      0.763     -0.000384
StageIV            -1515.6      0.763     2849.3        -1.434
Year:StageIV         0.763  -0.000384     -1.434      0.000722

(a) [12 marks total]

We replace Year in the model by Year1980 <- Year - 1980.

i. [6 marks] What are the values of the new estimates for (Intercept), Year1980, StageIV and Year1980:StageIV?

ii. [6 marks] What are the standard errors of the new estimates for Year1980, StageIV and Year1980:StageIV?
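One way to reason about part (a) is to substitute Year = Year1980 + 1980 into the linear predictor. The sketch below assumes Stage I is the reference level and uses $\beta_0$, $\beta_Y$, $\beta_{IV}$ and $\beta_{Y:IV}$ as labels (introduced here, not in the test) for the original (Intercept), Year, StageIV and Year:StageIV coefficients.

```latex
\begin{aligned}
\beta_0 + \beta_Y\,\mathrm{Year}
  &= (\beta_0 + 1980\,\beta_Y) + \beta_Y\,\mathrm{Year1980}, \\
\beta_{IV} + \beta_{Y:IV}\,\mathrm{Year}
  &= (\beta_{IV} + 1980\,\beta_{Y:IV}) + \beta_{Y:IV}\,\mathrm{Year1980}, \\
\operatorname{Var}\!\left(\hat\beta_{IV} + 1980\,\hat\beta_{Y:IV}\right)
  &= \operatorname{Var}(\hat\beta_{IV})
   + 1980^2 \operatorname{Var}(\hat\beta_{Y:IV})
   + 2 \cdot 1980 \operatorname{Cov}(\hat\beta_{IV}, \hat\beta_{Y:IV}).
\end{aligned}
```

Under this reading, the slope terms and their standard errors carry over unchanged, while the intercept and the stage main effects are re-expressed at the year 1980. The variance-matrix entries are rounded and the three terms in the last line nearly cancel, so the numerical result is sensitive to that rounding.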

(b) [5 marks]

According to the model, what is the estimated relative risk of relapse between a patient at Stage IV in 1990 and a patient at Stage III in 1980?
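A possible setup for part (b), again assuming Stage I is the reference level and writing $\beta_0$, $\beta_Y$, $\beta_s$, $\beta_{Y:s}$ (labels introduced here) for the (Intercept), Year, Stage and Year:Stage coefficients: the log relative risk is the difference between the two linear predictors.

```latex
\begin{aligned}
\log \mathrm{RR}
  &= \left[\beta_0 + 1990\,\beta_Y + \beta_{IV} + 1990\,\beta_{Y:IV}\right]
   - \left[\beta_0 + 1980\,\beta_Y + \beta_{III} + 1980\,\beta_{Y:III}\right] \\
  &= 10\,\beta_Y + (\beta_{IV} - \beta_{III})
   + 1990\,\beta_{Y:IV} - 1980\,\beta_{Y:III}, \\
\mathrm{RR} &= \exp(\log \mathrm{RR}).
\end{aligned}
```

The intercept cancels, but the interaction coefficients are multiplied by calendar years near 2000, so the result depends noticeably on how many decimal places of the estimates are used.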

(c) [4 marks]

According to the model, what is the estimated difference in log-risk of relapse corresponding to an increase in the year of diagnosis of 5 years for a patient in Stage I, and what is the standard error of this estimate?
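As a sketch of part (c), assume Stage I is the reference level, so that the Year coefficient (labelled $\beta_Y$ here, a symbol not used in the test) is the per-year change in log-risk for Stage I. A 5-year increase then scales both the estimate and its standard error by 5.

```latex
\begin{aligned}
\hat\Delta &= 5\,\hat\beta_Y = 5 \times (-0.02767) = -0.13835, \\
\operatorname{SE}(\hat\Delta) &= 5 \operatorname{SE}(\hat\beta_Y)
  = 5 \times 0.01960 = 0.0980.
\end{aligned}
```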

Question 2 [16 marks total]

We fit a GLM by solving the score equation $\sum_{i=1}^{n} x_i^T w_i (Y_i - \mu_i) = 0$, where $x_i$ is the $i$th $1 \times p$ covariate vector, $\mu_i = g^{-1}(x_i \beta)$ and $0$ is the $1 \times p$ vector of all zeros. The GLM involves a dispersion parameter $\phi > 0$.
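For orientation, the usual quasi-likelihood score for a GLM with variance function $V$ and link $g$ can be written as below. This is a sketch under the assumption that the weight $w_i$ in the test absorbs these factors up to the constant $\phi$; the course may normalise $w_i$ differently.

```latex
\sum_{i=1}^{n} x_i^T \, \frac{Y_i - \mu_i}{\phi\, V(\mu_i)\, g'(\mu_i)} = 0
\quad\Longrightarrow\quad
w_i \propto \frac{1}{V(\mu_i)\, g'(\mu_i)}
```

The factor $1 / g'(\mu_i)$ comes from the chain rule, since $\partial \mu_i / \partial \beta = x_i / g'(\mu_i)$.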

(a) [4 marks]

Explain why $w_i = 1$ when the canonical link is used.

(b) [4 marks total]

What is $w_i$ in the following settings?

i. [2 marks] Variance function $V(\mu) = \mu^2$, $\mu \in \mathbb{R}^+$, and link function $g(\mu) = \log(\mu)$.

ii. [2 marks] Poisson($\mu$) family and identity link.

(c) [4 marks]

How do we usually estimate φ? Write an expression for the estimator.

(d) [4 marks]

For a certain value of $\beta_0$, you are given the observed values

(this last subscript means “evaluated at $\beta = \beta_0$”).

Write down a test statistic that you can approximately compare to a $\chi^2_p$ quantile to test $H_0: \beta = \beta_0$ vs $H_1: \beta \neq \beta_0$.

Question 3 [8 marks total, plus 4 bonus marks]

Answer the following questions:

(a) [4 marks] You have written a scientific paper containing results from a linear regression model $E[Y \mid X = x] = x\beta$ fitted to independent count data. The data set was large and you estimated the variance of $\hat{\beta}$ using a sandwich estimator.

A reviewer writes that count data are not normally distributed, and therefore your Wald confidence intervals are incorrect because a) $\hat{\beta}$ is therefore not normally distributed either and b) the variances are wrong because they are estimated under the wrong model. How do you respond?

(b) [2 marks] Describe one situation in which a quasi-likelihood model may fail to produce reliable standard errors.

(c) [2 marks] True or False: Data sampled according to the outcome will yield biased regression estimates unless they are appropriately weighted.

(d) [4 marks] (bonus) A regression coefficient estimated from a parametric generalised linear model has covariance matrix $\mathrm{Cov}(\hat{\beta}) = \phi (X^T X)^{-1}$. Find two combinations of family and link that will yield this covariance.


