Validity of deterministic record linkage using multiple indirect personal identifiers linking a large registry to claims data

Soko Setoguchi Iwata, Ying Zhu, Jessica J. Jalbert, Lauren A. Williams, Chih Ying Chen

Publication Date: 01/01/2014

Background: Linking patient registries with administrative databases can enhance the utility of the databases for epidemiological and comparative effectiveness research. However, registries often lack direct personal identifiers, and the validity of record linkage using multiple indirect personal identifiers is not well understood.

Methods and Results: Using a large contemporary national cardiovascular device registry and 100% Medicare inpatient data, we linked hospitalization-level records. The main outcomes were the validity measures of several deterministic linkage rules using multiple indirect personal identifiers compared with rules using both direct and indirect personal identifiers. Linkage rules using 2 or 3 indirect, patient-level identifiers (ie, date of birth, sex, admission date) and hospital ID produced linkages with sensitivity of 95% and specificity of 98% compared with a gold standard linkage rule using a combination of both direct and indirect identifiers.

Conclusions: Ours is the first large-scale study to validate the performance of deterministic linkage rules without direct personal identifiers. When linking hospitalization-level records in the absence of direct personal identifiers, provider information is necessary for successful linkage.
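
To make the comparison described above concrete, the following is a minimal sketch of a deterministic (exact-match) linkage rule and of its validation against a gold standard rule. It is not the study's implementation. The field names (`ssn`, `dob`, `sex`, `admit_date`, `hospital_id`), the use of `ssn` as the direct identifier, the decision to discard non-unique matches, and the choice of the registry record as the unit for sensitivity and specificity are all assumptions made for illustration, and the records are synthetic.

```python
from collections import defaultdict


def link_records(registry, claims, keys):
    """Deterministic linkage: pair records that agree exactly on every key.

    Assumption: a key combination is linked only if it identifies exactly
    one registry record and exactly one claims record.
    """
    reg_index = defaultdict(list)
    clm_index = defaultdict(list)
    for r in registry:
        reg_index[tuple(r[k] for k in keys)].append(r["reg_id"])
    for c in claims:
        clm_index[tuple(c[k] for k in keys)].append(c["claim_id"])
    return {
        (reg_ids[0], clm_index[key][0])
        for key, reg_ids in reg_index.items()
        if len(reg_ids) == 1 and len(clm_index.get(key, [])) == 1
    }


def validity(candidate, gold, registry_ids):
    """Sensitivity and specificity of candidate links, per registry record.

    Assumption: a gold-linked record that the candidate rule leaves unlinked,
    or links to a different claim, counts as a false negative.
    """
    cand, ref = dict(candidate), dict(gold)
    tp = fn = fp = tn = 0
    for rid in registry_ids:
        if rid in ref:
            if cand.get(rid) == ref[rid]:
                tp += 1
            else:
                fn += 1
        elif rid in cand:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Synthetic records (hypothetical values, not study data).
FIELDS = ("ssn", "dob", "sex", "admit_date", "hospital_id")
registry = [
    dict(zip(FIELDS, v), reg_id=i)
    for i, v in [
        ("R1", ("A", "1940-01-01", "M", "2010-03-01", "H1")),
        ("R2", ("B", "1940-01-01", "M", "2010-03-01", "H2")),
        ("R3", ("C", "1935-05-05", "F", "2010-04-10", "H1")),
        ("R4", ("D", "1950-07-07", "F", "2010-06-01", "H3")),
        ("R5", ("F", "1960-02-02", "M", "2010-08-08", "H2")),
    ]
]
claims = [
    dict(zip(FIELDS, v), claim_id=i)
    for i, v in [
        ("C1", ("A", "1940-01-01", "M", "2010-03-01", "H1")),
        ("C2", ("B", "1940-01-01", "M", "2010-03-01", "H2")),
        ("C3", ("C", "1935-05-05", "F", "2010-04-11", "H1")),  # date off by one day
        ("C4", ("E", "1950-07-07", "F", "2010-06-01", "H3")),  # different person
    ]
]

reg_ids = [r["reg_id"] for r in registry]
# Gold standard: direct plus indirect identifiers (hypothetical rule).
gold = link_records(registry, claims, ("ssn", "dob", "sex"))
# Candidate rules: indirect identifiers only, with and without hospital ID.
with_hosp = link_records(
    registry, claims, ("dob", "sex", "admit_date", "hospital_id")
)
without_hosp = link_records(registry, claims, ("dob", "sex", "admit_date"))

print("with hospital ID:   ", validity(with_hosp, gold, reg_ids))
print("without hospital ID:", validity(without_hosp, gold, reg_ids))
```

On this toy data, the rule that includes hospital ID recovers two of the three gold standard links, whereas the rule without it recovers none, because two registry records collide on date of birth, sex, and admission date. These numbers only illustrate the mechanics and say nothing about the magnitudes reported in the study.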