Abstract:
STUDY PURPOSE The purpose of this study is to evaluate and validate the completeness and the spatial and attribute accuracy of a sample of address data for 10,000 patients, randomly extracted from two national datasets: the Primary Health Organisation (PHO) Enrolment Fact table and the National Health Index (NHI) linked patient addresses.
BACKGROUND The Auckland Map project aims to map vascular disease for the Auckland region using data from many sources. The project cannot succeed unless vascular disease information for Auckland is spatially accurate, so a data validation study of the key dataset, the PHO Enrolment Fact table, is necessary.
METHODS A field-by-field analysis of the 10,000 sample addresses was conducted to identify potential problems in the ‘full’ datasets we will be receiving. The analysis had two parts: field-by-field data validation of all 23 fields in the dataset, and address geocoding, which included checking the spatially related fields.
KEY RESULTS Seventeen of the 23 fields that we validated raised concerns. The key concerns are:
• While PHO address and Meshblock information matched our geocoding results in at least 9,150 of 10,000 cases (91.5%), there were at least 620 cases (6.2%) where the Meshblock information we derived through geocoding did not agree with the Meshblock information in the PHO sample dataset.
• PHO and NHI addresses were not synchronised with each other.
• More than 12% of values in the dom and cau_dom fields did not agree with our results.
CONCLUSION The concerns we have about the completeness and the spatial and attribute accuracy of the sample patient data must be addressed. We consider the PHO spatial information better suited to the Auckland Map project, because it achieved a higher matching rate in the geocoding process than the NHI spatial information.
For the future, we suggest recording address information in separate fields, with valid values enforced in certain fields, to reduce data input errors and provide better address information for geocoding. We also suggest regularly synchronising patient address information between the PHO and NHI systems. Finally, we recommend regularly checking the dom and cau_dom information in NHI against the Meshblock information in PHO.
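
The regular check of recorded against geocoded Meshblock information could be tallied along the following lines. This is a minimal sketch in Python, not the procedure used in the study: the record layout, the field names `recorded_meshblock` and `geocoded_meshblock`, and the `unresolved` category for records missing either value are all assumptions made for illustration, and the sample codes are invented.

```python
from collections import Counter


def meshblock_agreement(records):
    """Tally how often the recorded Meshblock agrees with the geocoded one.

    Each record is a dict with the hypothetical keys 'recorded_meshblock'
    and 'geocoded_meshblock'. Returns {category: (count, percentage)}.
    """
    counts = Counter()
    for rec in records:
        recorded = (rec.get("recorded_meshblock") or "").strip()
        geocoded = (rec.get("geocoded_meshblock") or "").strip()
        if not recorded or not geocoded:
            # Assumption: a record missing either value cannot be classified.
            counts["unresolved"] += 1
        elif recorded == geocoded:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    total = sum(counts.values())
    return {
        category: (counts[category], 100.0 * counts[category] / total if total else 0.0)
        for category in ("match", "mismatch", "unresolved")
    }


# Invented sample records, for illustration only.
sample = [
    {"recorded_meshblock": "0001", "geocoded_meshblock": "0001"},
    {"recorded_meshblock": "0002", "geocoded_meshblock": "0002"},
    {"recorded_meshblock": "0003", "geocoded_meshblock": "0003"},
    {"recorded_meshblock": "0004", "geocoded_meshblock": "0009"},
    {"recorded_meshblock": "0005", "geocoded_meshblock": None},
]
summary = meshblock_agreement(sample)
for category, (count, pct) in summary.items():
    print(f"{category}: {count} ({pct:.1f}%)")
```

Keeping a separate category for records that cannot be compared stops missing values from being counted as either agreement or disagreement.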