Abstract:
The generalised linear mixed model (GLMM) is one of the most important tools for analysing clustered data. A main feature of clustered data is that observational units within the same cluster are correlated, whereas observational units from different clusters may be independent. The random effects in the GLMM are used to model this correlation. Because the random effects are unobservable, writing down an exact expression for the marginal likelihood of the GLMM involves a high-dimensional integral, which is intractable when the dimension of the random effects is large.

There are two different approaches to this problem in the literature. The first approximates the integral directly by Laplace’s method (Breslow and Clayton, 1993; Pinheiro and Chao, 2006). The second approximates the integrand, or joint density, by a lower-dimensional object such as a product of marginal or conditional densities; this is also called pseudo-likelihood estimation (Besag, 1974). Typically, one cannot even write down the marginal likelihood explicitly, so Laplace’s method does not apply, but the pseudo-likelihood can still be used. Under various regularity conditions, the consistency and asymptotic normality of the pseudo-likelihood estimator have been established using generalised estimating equations (GEE). There are many ways to construct a pseudo-likelihood (Lindsay, 1988; Varin et al., 2011). In this thesis, I work exclusively with the pairwise composite likelihood, as it is the simplest pseudo-likelihood construction that still captures the pairwise correlation structure.

I am interested in the weighted pairwise composite likelihood under complex sampling. Complex sampling is typically informative (Pfeffermann, 1996), so weights must be added to the pairwise likelihood to account for informative sampling; these are usually chosen to be the inverse of the sampling inclusion probabilities. Rao et al. (2013) and Yi et al. (2016) considered the weighted pairwise likelihood for two-stage samples in the special case where the sampling clusters are the model clusters. They established consistency of the weighted pairwise composite likelihood estimator and suggested a variance estimator. In this thesis, I continue the study of the weighted pairwise composite likelihood estimator in complex sampling initiated by Rao et al. (2013) and Yi et al. (2016). More precisely, my goal is to extend the asymptotic results for the weighted pairwise likelihood estimator to the case where the sampling clusters are not the same as the model clusters. In particular, I establish the consistency and asymptotic normality of the weighted pairwise likelihood estimator. Furthermore, I show that the empirical variance estimator is consistent. This is more difficult than it first seems, because the structure of the sampling design means that pairs in the same model cluster might not be in the same sampling cluster. I present simulation results examining the performance of the weighted pairwise likelihood estimator for a random intercept model and a random slope model under various two-stage sampling designs.

Finally, the random effects in the mixed model could be correlated, as in spatial statistics. My goal here is to extend the asymptotic properties of the weighted pairwise composite likelihood estimator to the Matérn spatial random intercept model. More precisely, I establish consistency and asymptotic normality of the weighted pairwise likelihood estimator in that setting.
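
For concreteness, the weighted pairwise composite log-likelihood described above can be sketched as follows. The notation is illustrative and is an assumption rather than the notation of the thesis: $i$ indexes model clusters, $s_i$ denotes the sampled units in cluster $i$, $b_i$ is the random effect shared by units in cluster $i$ with density $g$, responses are assumed conditionally independent given $b_i$, and $\pi_{i,jk}$ is assumed to be the joint inclusion probability of the pair $(j,k)$.

```latex
\[
  \ell_w(\theta)
  = \sum_{i} \sum_{\substack{j<k \\ j,k \in s_i}}
    w_{i,jk}\, \log f(y_{ij}, y_{ik}; \theta),
  \qquad
  w_{i,jk} = \frac{1}{\pi_{i,jk}},
\]
\[
  f(y_{ij}, y_{ik}; \theta)
  = \int f(y_{ij} \mid b_i; \theta)\, f(y_{ik} \mid b_i; \theta)\,
    g(b_i; \theta)\, \mathrm{d}b_i .
\]
```

Under these assumptions each term requires only an integral over the random effects shared by a single pair, rather than the high-dimensional integral appearing in the full marginal likelihood.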