Deakin University

File(s) not publicly available

Secure Record Linkage of Large Health Data Sets: Evaluation of a Hybrid Cloud Model

Version 2 2024-06-19, 15:52
Version 1 2023-02-20, 00:44
journal contribution
posted on 2024-06-19, 15:52 authored by Adrian Paul Brown, Sean Randall
Background: The linking of administrative data across agencies provides the capability to investigate many health and social issues, with the potential to deliver significant public benefit. Despite its advantages, the use of cloud computing resources for linkage purposes is scarce, with the storage of identifiable information on cloud infrastructure assessed as high risk by data custodians.

Objective: This study aims to present a model for record linkage that utilizes cloud computing capabilities while assuring custodians that identifiable data sets remain secure and local.

Methods: A new hybrid cloud model was developed, including privacy-preserving record linkage techniques and container-based batch processing. An evaluation of this model was conducted with a prototype implementation using large synthetic data sets representative of administrative health data.

Results: The cloud model keeps identifiers on premises and uses privacy-preserved identifiers to run all linkage computations on cloud infrastructure. Our prototype used a managed container cluster in Amazon Web Services to distribute the computation using existing linkage software. Although the cost of computation was relatively low, the use of existing software resulted in a processing overhead of 35.7% (149/417 min execution time).

Conclusions: The results of our experimental evaluation show the operational feasibility of such a model and the exciting opportunities for advancing the analysis of linkage outputs.
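The abstract does not say which privacy-preserving technique the authors used. As an illustration only, the sketch below assumes one common approach: each identifier is split into character bigrams and encoded on premises into Bloom filter bit positions with a keyed hash, so that only the encodings need to be compared on cloud infrastructure. The function names, the filter size, the number of hash functions, and the secret key are all hypothetical and are not taken from the paper.

```python
import hashlib
import hmac
from typing import Set


def bigrams(value: str) -> Set[str]:
    """Split a normalised identifier into padded character bigrams."""
    padded = "_" + value.strip().lower() + "_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value: str, secret: bytes, size: int = 256, num_hashes: int = 4) -> Set[int]:
    """Return the set Bloom filter bit positions for one identifier.

    Assumption: this step runs on premises, and `secret` never leaves
    the custodian's environment.
    """
    positions = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            message = "{}:{}".format(k, gram).encode("utf-8")
            digest = hmac.new(secret, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % size)
    return positions


def dice(a: Set[int], b: Set[int]) -> float:
    """Dice coefficient of two encodings; needs no access to raw identifiers."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    key = b"hypothetical-key-held-on-premises"
    left = encode("Catherine", key)
    right = encode("Katherine", key)
    # Similar names share most bigrams, so their encodings overlap heavily.
    print(round(dice(left, right), 2))
```

In this sketch, only the output of `encode` would be sent to the cloud, where `dice` scores candidate pairs. A production system would also need blocking, threshold selection, and key management, none of which are shown here.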

History

Journal

JMIR MEDICAL INFORMATICS

Volume

8

Article number

ARTN e18920

Location

Canada

ISSN

2291-9694

eISSN

2291-9694

Language

English

Publication classification

C1 Refereed article in a scholarly journal

Issue

9

Publisher

JMIR PUBLICATIONS, INC
