Privacy-Preserving Distributed Queries for a Clinical Case Research Network.

Schadow G, Grannis S, McDonald CJ
IEEE International Conference on Data Mining, Workshop on Privacy, Security, and Data Mining. Maebashi City, Japan, 2002; vol. 14, pp. 55-65.
Abstract: 
We present the motivation, use case, and requirements of a clinical case research network that would allow biomedical researchers to perform retrospective analysis on de-identified clinical cases joined across a large-scale (nationwide) distributed network. Based on semi-join adaptive plans for fusion-queries, in this paper we discuss how joining can be done in a way that protects the privacy of the individual patients involved. Our method is based on a cryptographically strong keyed-hash algorithm (HMAC). These hash values are truncated, and the resulting hash collisions in semi-join filters are exploited to limit the ability of an apprentice-site to re-identify patients in the filter. As a measure of privacy we use likelihood ratios. Since the join key is based on real person identifiers, we need to apply the methods of record linkage to hashing and semi-join filters. We find that multiple disjunctive rules, as used in deterministic matching, lead here to a higher privacy risk than rules based on a single identifier vector.
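The idea of a truncated keyed hash used as a semi-join filter can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of SHA-256 as the underlying hash, the function names (`truncated_hmac`, `build_filter`, `semi_join`), the shared key, and the identifier strings are all assumptions made for illustration. Truncating to a few bits makes many different identifiers collide on the same filter value, so a site receiving the filter gets candidate matches rather than a list of identifiable patients.

```python
import hashlib
import hmac


def truncated_hmac(key: bytes, identifier: str, bits: int = 8) -> int:
    """Return the leading `bits` bits of HMAC(key, identifier) as an integer.

    SHA-256 is an assumption for this sketch; the abstract only says HMAC.
    """
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).digest()
    total_bits = len(digest) * 8
    if not 1 <= bits <= total_bits:
        raise ValueError("bits must be between 1 and the digest length in bits")
    # Shift right so that only the leading `bits` bits remain.
    return int.from_bytes(digest, "big") >> (total_bits - bits)


def build_filter(key: bytes, identifiers, bits: int = 8) -> set:
    """Build a semi-join filter: the set of truncated hashes of one site's join keys."""
    return {truncated_hmac(key, ident, bits) for ident in identifiers}


def semi_join(key: bytes, local_identifiers, remote_filter: set, bits: int = 8) -> list:
    """Keep local identifiers whose truncated hash appears in the remote filter.

    The result contains every true match, plus any false positives caused
    by hash collisions; those collisions are what obscure individuals.
    """
    return [
        ident
        for ident in local_identifiers
        if truncated_hmac(key, ident, bits) in remote_filter
    ]


if __name__ == "__main__":
    shared_key = b"hypothetical-shared-secret"  # assumed key shared by the sites
    site_a = ["DOE|JOHN|1950-01-01", "ROE|JANE|1962-07-15"]  # made-up join keys
    site_b = ["DOE|JOHN|1950-01-01", "POE|EDGAR|1949-10-07"]

    filter_a = build_filter(shared_key, site_a, bits=8)
    candidates = semi_join(shared_key, site_b, filter_a, bits=8)
    print(candidates)
```

A smaller `bits` value produces more collisions, which increases the number of false-positive candidates and reduces how much the filter reveals about any one person; a larger value does the opposite.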