Clinical Text De-Identification Research

Kayaalp M, Browne AC, Dodd Z, Sagan P, McDonald CJ
September 2013 Technical Report to the LHNCBC Board of Scientific Counselors
Abstract: 

The Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) requires that clinical documents be stripped of personally identifying information before they can be released to researchers and others. We have been developing a software tool to de-identify clinical records, which we have named NLM Scrubber. Version 1.0 of the system recognizes and redacts patient names, alphanumeric identifiers, addresses, and dates. NLM Scrubber de-identifies around 99% of these identifiers and preserves 99% of the health information text that contains no personal identifiers, provided that de-identified provider names are not counted as false positives. We plan to release the system as an open source tool in early 2014.
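The two rates quoted in the abstract can be read as a sensitivity (share of identifier text that was redacted) and a specificity (share of non-identifier text that was left intact). The following is a minimal sketch of how such rates might be computed over aligned tokens. It is not NLM Scrubber's evaluation code: the function name `deid_rates`, the token-level granularity, and the toy labels are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def deid_rates(gold: Sequence[bool], redacted: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) over aligned tokens.

    gold[i] is True when token i is a personal identifier (reference annotation).
    redacted[i] is True when the de-identification tool redacted token i.
    Hypothetical helper; token-level scoring is an assumption, not the report's method.
    """
    if len(gold) != len(redacted):
        raise ValueError("gold and redacted must have the same length")

    tp = sum(1 for g, r in zip(gold, redacted) if g and r)          # identifier, redacted
    fn = sum(1 for g, r in zip(gold, redacted) if g and not r)      # identifier, missed
    tn = sum(1 for g, r in zip(gold, redacted) if not g and not r)  # clinical text, kept
    fp = sum(1 for g, r in zip(gold, redacted) if not g and r)      # clinical text, lost

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy, made-up labels for eight tokens (not data from the report).
gold = [True, True, False, False, False, False, True, False]
redacted = [True, True, False, True, False, False, False, False]
sens, spec = deid_rates(gold, redacted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

The abstract's second figure does not count de-identified provider names as false positives; this sketch makes no such adjustment, so any comparison with the reported numbers would need that exclusion applied first.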

Kayaalp M, Browne AC, Dodd Z, Sagan P, McDonald CJ. Clinical Text De-Identification Research. September 2013 Technical Report to the LHNCBC Board of Scientific Counselors.