Abstract:
Two-phase designs measure additional variables on a subsample of a cohort in
which some variables have already been measured for everyone. The goal is to
choose that subsample and to analyse it efficiently, so an optimal design is
one that gives the most efficient estimates of the regression parameters of
interest. In this paper, we propose a
multi-wave sampling design to approximate the optimal design for design-based
estimators. Influence functions are used to compute the optimal sampling
allocations. We propose to use informative priors on regression parameters to
derive the wave-1 sampling probabilities because any pre-specified sampling
probabilities may be far from optimal and decrease efficiency. Generalised
raking is used for the statistical analysis. We show that two-wave sampling
with reasonable informative priors yields higher precision for the parameter
of interest and comes close to the underlying optimal design.
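
As a rough illustration of how influence functions can drive a sampling allocation, the sketch below applies Neyman allocation, in which the sample size for a stratum is proportional to the stratum size times the within-stratum standard deviation of the influence function. The abstract does not name the allocation rule, so Neyman allocation is an assumption here. The function name `neyman_allocation` and its arguments are likewise hypothetical. In practice the influence-function values would come from a model fitted with prior information or wave-1 data.

```python
from collections import defaultdict
from math import floor
from statistics import stdev


def neyman_allocation(strata, influence, n_total):
    """Allocate n_total phase-2 samples across strata (illustrative sketch).

    strata    : stratum label for each cohort member
    influence : estimated influence-function value for each cohort member
    n_total   : total number of individuals to sample

    Assumes every stratum has at least two members and that the influence
    values are not constant within every stratum.
    """
    # Group the influence values by stratum.
    groups = defaultdict(list)
    for label, value in zip(strata, influence):
        groups[label].append(value)

    # Neyman weight for each stratum: N_h * S_h.
    weights = {h: len(v) * stdev(v) for h, v in groups.items()}
    total = sum(weights.values())

    # Round down and never sample more than the stratum contains, so the
    # allocated total can be slightly below n_total.
    return {
        h: min(len(groups[h]), floor(n_total * w / total))
        for h, w in weights.items()
    }
```

Strata whose influence-function values vary more receive a larger share of the sample. Rounding down keeps the total within budget, and any leftover units would need a separate rule, which this sketch does not cover.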