Cassa, Miller, Mandl. A novel, privacy-preserving cryptographic approach for sharing sequencing data. J Am Med Inform Assoc. 2013;20:69–76.
Notes
Cassa, Christopher A
Miller, Rachel A
Mandl, Kenneth D
eng
HD040128/HD/NICHD NIH HHS/
LM010470-01/LM/NLM NIH HHS/
Research Support, N.I.H., Extramural
2012/11/06 06:00
J Am Med Inform Assoc. 2013 Jan 1;20(1):69-76. doi: 10.1136/amiajnl-2012-001366. Epub 2012 Nov 2.
Abstract
OBJECTIVE: DNA samples are often processed and sequenced in facilities external to the point of collection. These samples are routinely labeled with patient identifiers or pseudonyms, allowing for potential linkage to identity and private clinical information if intercepted during transmission. We present a cryptographic scheme to securely transmit externally generated sequence data which does not require any patient identifiers, public key infrastructure, or the transmission of passwords.

MATERIALS AND METHODS: This novel encryption scheme cryptographically protects participant sequence data using a shared secret key that is derived from a unique subset of an individual's genetic sequence. This scheme requires access to a subset of an individual's genetic sequence to acquire full access to the transmitted sequence data, which helps to prevent sample mismatch.

RESULTS: We validate that the proposed encryption scheme is robust to sequencing errors, population uniqueness, and sibling disambiguation, and provides sufficient cryptographic key space.

DISCUSSION: Access to a set of an individual's genotypes and a mutually agreed cryptographic seed is needed to unlock the full sequence, which provides additional sample authentication and authorization security. We present modest fixed and marginal costs to implement this transmission architecture.

CONCLUSIONS: It is possible for genomics researchers who sequence participant samples externally to protect the transmission of sequence data using unique features of an individual's genetic sequence.
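
The key-derivation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' construction: the locus-selection rule, the use of PBKDF2, the function names `select_loci` and `derive_key`, and the rsID-style locus labels are all assumptions made for illustration. It also omits the tolerance to sequencing errors that the paper reports, so a single miscalled genotype at a selected locus would change the key here.

```python
import hashlib
import hmac


def select_loci(locus_ids, seed: bytes, k: int):
    """Pick k loci deterministically from the agreed seed.

    Hypothetical rule: rank loci by HMAC-SHA256(seed, locus_id) and keep
    the first k, so both parties choose the same subset independently.
    """
    ids = list(locus_ids)
    if k > len(ids):
        raise ValueError("k exceeds the number of available loci")
    ranked = sorted(
        ids,
        key=lambda lid: hmac.new(seed, lid.encode(), hashlib.sha256).digest(),
    )
    return ranked[:k]


def derive_key(genotypes: dict, seed: bytes, k: int, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte symmetric key from genotypes at the selected loci.

    The genotype calls act as the secret and the seed as the salt
    (an assumption of this sketch, not a detail given in the abstract).
    """
    loci = select_loci(genotypes.keys(), seed, k)
    material = "|".join(f"{lid}={genotypes[lid]}" for lid in loci).encode()
    return hashlib.pbkdf2_hmac("sha256", material, seed, iterations)


if __name__ == "__main__":
    # Made-up genotype calls keyed by made-up locus labels.
    sample = {"rs0001": "AG", "rs0002": "CC", "rs0003": "TT", "rs0004": "AA"}
    key = derive_key(sample, b"agreed-seed", k=3)
    print(key.hex())
```

Under these assumptions, the sequencing facility and the collecting site would each hold genotypes for the same individual, derive the same key locally, and never transmit a password or patient identifier; the derived key could then feed any standard symmetric cipher.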
Last updated on 02/25/2023