BIOS5001 / PUBH6001 Introduction to Biostatistics

2024 – 2025 (Term 1)

Assignment 3

1. (viral.sav) The SPSS dataset viral.sav contains 6 variables measured on 24 HIV-positive subjects:

Age = age of patient in years

Risk = 1 if the patient's risk factor was MSM, or 2 if the risk factor was heterosexual

Days = days from symptom onset until blood sample was taken

CD4 = CD4 cell count in 10^6 per liter

Viral = Blood viral load

Lgviral = Log10(viral load)

Your goal is to find the best linear regression model for predicting blood viral load (the outcome variable), expressed as either Viral or Lgviral, using the other 4 variables as potential predictors.

(a) From the six scatterplots of the two potential outcome variables vs. the three quantitative predictor variables, decide which of the outcome variables (Viral or Lgviral) would be more appropriate for linear regression analysis (6 marks). On what characteristic of the scatterplots did you base your choice (10 marks)?

(b) Use backward elimination to determine a model for predicting the blood viral load of a future patient and show the “Coefficients” table in the SPSS/PSPP output (10 marks). Write down the equation of your final model (8 marks).

(c) From your final model in (b), what would be the fitted value (8 marks) and the residual (8 marks) for the first subject in the dataset, who was 28 years old, had a CD4 count of 361, had 24 days between onset of symptoms and sampling, and had the “MSM” risk factor, given that the observed blood viral load is 186208.71 or, equivalently, that log10 of the observed blood viral load is 5.27? (Please write down the calculation steps.)
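The arithmetic asked for in (c) can be sketched as follows. This is only an illustration of the calculation steps: the intercept and coefficient below are hypothetical placeholders, not values from viral.sav, and must be replaced with the entries of your own final “Coefficients” table. The helper name fitted_and_residual is likewise an assumption made for this sketch.

```python
import math

def fitted_and_residual(intercept, coefs, predictors, observed):
    """Return (fitted, residual) for a linear model, where residual = observed - fitted."""
    fitted = intercept + sum(b * predictors[name] for name, b in coefs.items())
    return fitted, observed - fitted

# Values for the first subject, as stated in the question.
subject = {"Age": 28, "CD4": 361, "Days": 24, "Risk": 1}
observed_viral = 186208.71
observed_lgviral = round(math.log10(observed_viral), 2)  # 5.27, matching the question

# HYPOTHETICAL coefficients for illustration only; use your own SPSS/PSPP output.
intercept = 5.0
coefs = {"CD4": -0.001}

fitted, residual = fitted_and_residual(intercept, coefs, subject, observed_lgviral)
print(f"fitted = {fitted:.3f}, residual = {residual:.3f}")
```

If the final model uses Lgviral as the outcome, both the fitted value and the residual are on the log10 scale; only predictors retained by backward elimination should appear in coefs.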

2. (disease.sav) The SPSS dataset disease.sav contains 3 variables for 200 patients suffering from Disease A.

status = 1 if the patient died, 0 if the patient survived

agemid = midpoint of the age group to which the patient belonged

gender = 0 for females and 1 for males

(a) Please conduct a simple (univariate) logistic regression, with status as the outcome variable and gender as the predictor variable (female as reference level), and show the “Variables in the Equation” table in the SPSS/PSPP output (10 marks). What is the odds ratio (males:females) for mortality (4 marks)? Is there a statistically significant difference in mortality between males and females (please explain using the p-value from your output) (6 marks)?
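As a reminder of how the odds ratio relates to the logistic regression coefficient, the sketch below exponentiates a coefficient B and, optionally, forms a Wald 95% confidence interval from its standard error. The values of b_gender and se_gender are hypothetical placeholders, not results from disease.sav; in the SPSS output these quantities usually appear in the “B”, “S.E.” and “Exp(B)” columns.

```python
import math

def odds_ratio(b, se=None, z=1.96):
    """Convert a logistic coefficient b into an odds ratio.

    If a standard error is supplied, also return the Wald interval
    (exp(b - z*se), exp(b + z*se)); otherwise the interval is None.
    """
    point = math.exp(b)
    if se is None:
        return point, None
    return point, (math.exp(b - z * se), math.exp(b + z * se))

# HYPOTHETICAL values for illustration only; read the real ones from your output.
b_gender = 0.5
se_gender = 0.3

or_gender, ci = odds_ratio(b_gender, se_gender)
print(f"OR (males:females) = {or_gender:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With gender coded 0 for females and 1 for males, exp(B) is the odds of death for males relative to females. A confidence interval that contains 1 corresponds to a Wald p-value above 0.05.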

(b) Please conduct a multiple logistic regression with status as the outcome variable and both gender (female as reference level) and agemid as predictors, and show the “Variables in the Equation” table in the SPSS/PSPP output (10 marks). What is the adjusted odds ratio (males:females) for mortality (4 marks)? Is the difference in mortality between males and females statistically significant after adjusting for the other variables in the model (please explain using the p-value from your output) (6 marks)? How would you explain the change in the coefficient of gender between the two models (10 marks)?
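For reference, the two models compared in (a) and (b) can be written as below. The symbols (beta for the unadjusted model, starred beta for the adjusted model) are notation assumed here for illustration and do not come from the assignment.

```latex
\begin{align*}
\text{Model (a):}\quad
  \log\frac{P(\text{status}=1)}{1-P(\text{status}=1)}
  &= \beta_0 + \beta_1\,\text{gender} \\
\text{Model (b):}\quad
  \log\frac{P(\text{status}=1)}{1-P(\text{status}=1)}
  &= \beta_0^{*} + \beta_1^{*}\,\text{gender} + \beta_2^{*}\,\text{agemid} \\
\text{Unadjusted OR} &= e^{\beta_1}, \qquad
\text{Adjusted OR} = e^{\beta_1^{*}}
\end{align*}
```

The adjusted odds ratio compares males with females at the same value of agemid, so any difference between the two gender coefficients reflects how agemid relates to both gender and status in the data.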

