Abstract:
We describe several methods for generating synthetic data sets from a combination of publicly available marginal tables and microdata samples. Our methods fit parsimonious statistical models to high-dimensional tables of relative frequencies and then generate synthetic data from the fitted models. We describe a set of R functions that implement these methods and apply them to data from the 2001 Census of Population and Dwellings.