
Midterm Project
Names:
Student IDs:
1 Problem Description
Read the article Maximum Likelihood Algorithms for Generalized Linear Mixed Models (McCulloch,
1997) and try to understand the basic concepts of the generalized linear mixed model
(GLMM). Section 3.1 of that paper describes a Monte Carlo EM (MCEM) method for deriving
maximum likelihood estimates (MLEs). Please use your own software (Matlab, R, C++,
etc.) to perform the following simulation study and answer the related questions.
1.1 Model and Notations
In this project, we consider a clustering problem. Suppose we have observed n subjects,
each of which is a binary process, i.e. the response Yij = 0 or 1, i = 1, ..., n,
j = 1, ..., T. Here n is the number of subjects and T is the length of observation. In general,
T might vary across subjects, and the time points may also differ. In this project, however, we
simply assume that all subjects share a common time length and common time points. We also assume
that these subjects belong to two clusters. For each cluster, the conditional expectation of the
response variable is:
Pij ≡ E(Yij | Ui = 1, X1,ij, Z1,i) = g(β1 X1,ij + Z1,i)
Pij ≡ E(Yij | Ui = 2, X2,ij, Z2,i) = g(β2 X2,ij + Z2,i) (1)
where U is the cluster membership, and Xc,ij and Zc,i (c = 1, 2) are the fixed and random effects,
respectively. The link function g(x) = exp(x) / (1 + exp(x)) is given. In a typical clustering problem, U
is usually unknown, and hence we treat U as another random effect.
For the random effects, we assume that Zc,i ~ N(0, σc²) and P(U = 1) = π1 (so π2 = 1 − π1).
The parameter to be estimated is then θ = {β1, β2, σ1, σ2, π1}. Treating the random effects as
missing data, one can write the complete-data likelihood function as

L(θ) = ∏i=1..n ∏c=1..2 { πc fc(Zc,i) ∏j=1..T fc(Yij | Zc,i) }^wic

where fc(Zc,i) is the density function of the N(0, σc²) distribution, fc(Yij | Zc,i) = Pij^Yij (1 − Pij)^(1−Yij) is the Bernoulli likelihood, and

wic = 1 if subject i belongs to cluster c
      0 otherwise,
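As a concrete illustration, the quantity the MCEM E-step approximates is the average of this complete-data log-likelihood over Monte Carlo draws of (U, Z). A minimal Python sketch under the notation above (0-based cluster labels, so U = 0 means cluster 1; the function names are our own, not from the assignment):

```python
import numpy as np

def g(x):
    """Logistic link g(x) = exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + np.exp(-x))

def complete_loglik(Y, X, U, Z, beta, sigma, pi1):
    """Complete-data log-likelihood for one draw of (U, Z).

    Y, X are n-by-T arrays; U holds 0/1 cluster labels (0 means cluster 1),
    Z the matching random effects; beta and sigma are length-2 sequences.
    """
    beta_i = np.asarray(beta)[U]             # beta_c for each subject's cluster
    sig_i = np.asarray(sigma)[U]             # sigma_c likewise
    pi_i = np.where(U == 0, pi1, 1.0 - pi1)  # pi_c likewise
    P = g(beta_i[:, None] * X + Z[:, None])  # P_ij from equation (1)
    # Bernoulli part: sum_ij [ Y log P + (1 - Y) log(1 - P) ]
    ll = np.sum(Y * np.log(P) + (1 - Y) * np.log1p(-P))
    # N(0, sigma_c^2) log-density of the random effects, plus log pi_c
    ll += np.sum(-0.5 * np.log(2 * np.pi * sig_i**2) - Z**2 / (2 * sig_i**2))
    ll += np.sum(np.log(pi_i))
    return ll

def mc_average_loglik(Y, X, draws, beta, sigma, pi1):
    """Monte Carlo average over the retained draws [(U_1, Z_1), ..., (U_K, Z_K)]."""
    return np.mean([complete_loglik(Y, X, U, Z, beta, sigma, pi1)
                    for U, Z in draws])
```

This is the function the M-step would maximize over θ at each EM iteration, with the draws held fixed.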
1.2 Simulation Setup and Requirement
Generate 100 simulations. In each simulation, set n = 100 and T = 10. The true parameter values are: β1 = β2 = 1, π1 = 0.6, σ1 = 2, and σ2 = 10.
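For reference, one dataset under these true values could be generated as follows (a minimal Python sketch; `simulate`, its defaults, and the 0-based cluster labels are our own choices, not part of the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the shared X is reproducible

def g(x):
    """Logistic link g(x) = exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + np.exp(-x))

def simulate(X, beta=(1.0, 1.0), sigma=(2.0, 10.0), pi1=0.6):
    """Draw (Y, U, Z) from the two-cluster model, given fixed effects X."""
    n = X.shape[0]
    U = rng.choice(2, size=n, p=[pi1, 1.0 - pi1])         # labels, P(U=0) = pi1
    Z = rng.normal(0.0, np.asarray(sigma)[U])             # Z_{c,i} ~ N(0, sigma_c^2)
    P = g(np.asarray(beta)[U][:, None] * X + Z[:, None])  # P_ij
    Y = rng.binomial(1, P)                                # binary responses Y_ij
    return Y, U, Z

X = rng.standard_normal((100, 10))  # fixed effects X_ij ~ N(0, 1), reused across simulations
Y, U, Z = simulate(X)
```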
Before you start, use N(0, 1) to generate the fixed effects X, and use them for all 100
simulations. Please follow the paper mentioned earlier and use MCEM to evaluate the log-likelihood
function. In the E-step, run K = 500 iterations of Gibbs sampling incorporating a
Metropolis-Hastings step, and drop the first 100 iterations as burn-in.
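One way to organize this E-step is a Gibbs sweep that alternates a cluster-label update for each Ui with a random-walk Metropolis-Hastings update for each Zi (whose full conditional has no closed form). The sketch below is one such implementation, not the assignment's prescribed one: the names, 0-based labels, and the proposal scale `step` are all our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_cond_z(z, y, x, beta_c, sigma_c):
    """Log of the unnormalized full conditional of Z_i given U_i = c."""
    eta = beta_c * x + z
    # Stable Bernoulli log-likelihood: y*log g(eta) + (1-y)*log(1-g(eta))
    ll = np.sum(y * eta - np.logaddexp(0.0, eta))
    return ll - z**2 / (2 * sigma_c**2)

def estep_sampler(Y, X, beta, sigma, pi1, K=500, burn=100, step=1.0):
    """K Gibbs sweeps over (U, Z); returns the post-burn-in draws."""
    n, _ = Y.shape
    U = rng.integers(0, 2, size=n)   # initial 0/1 cluster labels
    Z = np.zeros(n)                  # initial random effects
    pis = np.array([pi1, 1.0 - pi1])
    draws = []
    for k in range(K):
        for i in range(n):
            # Update U_i | Z_i, Y_i: p(U_i=c | ...) ∝ pi_c N(Z_i; 0, sigma_c^2) f_c(Y_i | Z_i).
            # The -log(sigma_c) term is the Normal normalizer; -0.5*log(2*pi) cancels.
            logw = np.array([log_cond_z(Z[i], Y[i], X[i], beta[c], sigma[c])
                             + np.log(pis[c]) - np.log(sigma[c]) for c in (0, 1)])
            w = np.exp(logw - logw.max())
            U[i] = rng.choice(2, p=w / w.sum())
            # Metropolis-Hastings step for Z_i | U_i with a Normal random walk.
            z_prop = Z[i] + step * rng.standard_normal()
            logr = (log_cond_z(z_prop, Y[i], X[i], beta[U[i]], sigma[U[i]])
                    - log_cond_z(Z[i], Y[i], X[i], beta[U[i]], sigma[U[i]]))
            if np.log(rng.random()) < logr:
                Z[i] = z_prop
        if k >= burn:                # drop the first `burn` sweeps as burn-in
            draws.append((U.copy(), Z.copy()))
    return draws
```

In the full MCEM loop, the retained draws would feed the Monte Carlo average of the complete-data log-likelihood, which the M-step then maximizes over θ before the next E-step.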
1.3 Your Report
(1) Please write down the form of the Monte Carlo average of the log-likelihood (which you
are going to evaluate).
(2) Please write down details of the EM algorithm you use for this simulation, especially
the Metropolis-Hastings steps.
(3) What are your initial values? What is your convergence rule?
(4) How can you accelerate your EM algorithm? What improvement, if any, do you observe?
(5) Try different numbers of simulations: 200, 300, ..., 1000, and plot the corresponding
MSE.
(6) Write a report in either Chinese or English. Please attach your code to your report.