Abstract:
This thesis is devoted to the study of the asymptotic theory for categorical data models, with an emphasis on moderate or small sample sizes. The current diagnostic methods used to analyze multinomial data are generally based on first-order asymptotics. In doing so, they sacrifice some information about the geometry of the model. In this thesis, we therefore concentrate on second-order asymptotics and study the effect of ignoring the second-order terms. In particular, we study the asymptotic properties of residuals and maximum likelihood parameter estimates from the parametric multinomial family of models.
In residual analysis, the goal is to use residuals which come close to behaving like their normal linear-theory counterparts. This goal led to the use of the so-called adjusted residuals (cf. Haberman, 1973; Rao, 1973). Unfortunately, these diagnostics are valid only for reasonably large sample sizes. One problem with such methods is the lack of guidelines on the sample size necessary to warrant their use. In the absence of such guidelines, the validity of these methods may be questionable, and alternative residuals which do not depend on this requirement may be used instead. One of the aims of this thesis is to construct general multinomial residuals which not only behave like the linear regression residuals but can also be used for moderate sample sizes. These residuals take the nature of the models into account by incorporating the second-order information.
The diagnostic methods discussed above are useful for finding general inadequacies in a multinomial model. In particular, they are useful for detecting extreme multinomial cells. A related problem which cannot be easily addressed by those methods is that of stability, that is, the study of the variation in the results of the analysis when the problem formulation is modified. For example, they cannot be used to assess the impact of individual cells on the various aspects of the fit, e.g. parameter estimates and goodness-of-fit statistics. An approach which attempts to quantify the effect of individual observations on the fit is the perturbation method. A common perturbation scheme in regression is that of case deletion. This works well because the observations are independent. In multinomial models, however, the terms in the log-likelihood function corresponding to the cells are not independent. In this case, it does not make sense merely to remove a term from the likelihood function. Diagnostics akin to those developed for regression models may be derived by replacing each cell probability with the conditional probability given that the suspect cell is omitted, and then forming the likelihood function from the remaining cells (cf. Andersen, 1992). Andersen used this idea to derive a scalar measure of Cook's distance for multinomial models.
Related and important problems which he did not examine are the changes in the other diagnostics, such as the Pearson residuals, the deviance and the likelihood displacement, that result from his perturbation scheme. We develop a likelihood theory for the conditional model and study the impact of conditioning on these diagnostics. Moreover, the numerical quantities resulting from Andersen's scalar measure are not easy to interpret. Consequently, we propose a new Cook's distance for multinomial models that can be interpreted in much the same way as its linear regression counterpart. A new perturbation scheme which includes the unconditional ("full") model as a special case is also proposed and used to derive further diagnostic measures, including influence curves. Its advantage over Andersen's scheme is that it allows simple perturbations of the cells of interest and can therefore be used to study the effects of infinitesimal changes in multinomial observations. It enables us to unify the likelihood theory for multinomial models.
We also study how influential observations affect the biases in the maximum likelihood parameter estimators and Pearson residuals. In particular, we study how the biases change when one uses a conditional model instead of the unconditional model. It is shown that the bias vectors are functions of weighted versions of the corresponding vectors for the unconditional model. This means that linear regression theory (cf. Cook and Weisberg, 1982) can be used to express them in terms of well-known quantities for the unconditional model. The main achievement of the whole process is the extension of the differential geometric framework for multinomial models (cf. Wei, 1993; Wei and Shouye, 1995) to conditional multinomial models. This generalizes the results of the latter authors. Another problem of particular interest in this thesis is that of constructing confidence regions for the multinomial parameters. We use the asymptotic theory to construct new asymptotic regions for multinomial parameters and study the effect of including the second-order terms on them. We propose and study two competing methods of constructing these regions.