Find below a non-exhaustive but detailed list of scientific publications on genome privacy and security. Please contact us for any missing publications.
Zhen Lin; Art B. Owen; Russ B. Altman: Genomic Research and Human Subject Privacy. Science, 305 (5681), pp. 183–183, 2004, ISSN: 0036-8075, 1095-9203.
@article{lin_genomic_2004,
  title = {Genomic Research and Human Subject Privacy},
  author = {Zhen Lin and Art B. Owen and Russ B. Altman},
  url = {http://www.sciencemag.org/content/305/5681/183},
  doi = {10.1126/science.1095019},
  issn = {0036-8075, 1095-9203},
  year = {2004},
  date = {2004-07-01},
  urldate = {2013-07-11},
  journal = {Science},
  volume = {305},
  number = {5681},
  pages = {183--183},
  abstract = {Public genetic sequence databases are a critical part of our academic biomedical research infrastructure. However, human genetic data should only be made public if we can adequately protect the privacy of research subjects. Individual genomic sequence data (such as SNPs) are quite "identifiable" using common definitions, while our efforts to understand disease susceptibility or therapeutic opportunity require access to large genomic data sets. The authors of this Policy Forum argue that surprisingly small amounts of genomic sequence data are identifiable. Therefore, the special privacy challenges posed by genomic data need to be addressed with new policies or creative technical approaches.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Bradley Malin; Latanya Sweeney: How (not) to protect genomic data privacy in a distributed network: using trail re-identification to evaluate and design anonymity protection systems. Journal of biomedical informatics, 37 (3), pp. 179–192, 2004, ISSN: 1532-0464.
@article{malin_how_2004,
  title = {How (not) to protect genomic data privacy in a distributed network: using trail re-identification to evaluate and design anonymity protection systems},
  author = {Bradley Malin and Latanya Sweeney},
  doi = {10.1016/j.jbi.2004.04.005},
  issn = {1532-0464},
  year = {2004},
  date = {2004-06-01},
  journal = {Journal of biomedical informatics},
  volume = {37},
  number = {3},
  pages = {179--192},
  abstract = {The increasing integration of patient-specific genomic data into clinical practice and research raises serious privacy concerns. Various systems have been proposed that protect privacy by removing or encrypting explicitly identifying information, such as name or social security number, into pseudonyms. Though these systems claim to protect identity from being disclosed, they lack formal proofs. In this paper, we study the erosion of privacy when genomic data, either pseudonymous or data believed to be anonymous, are released into a distributed healthcare environment. Several algorithms are introduced, collectively called RE-Identification of Data In Trails (REIDIT), which link genomic data to named individuals in publicly available records by leveraging unique features in patient-location visit patterns. Algorithmic proofs of re-identification are developed and we demonstrate, with experiments on real-world data, that susceptibility to re-identification is neither trivial nor the result of bizarre isolated occurrences. We propose that such techniques can be applied as system tests of privacy protection capabilities.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Bernd Blobel: Authorisation and access control for electronic health record systems. International Journal of Medical Informatics, 73 (3), pp. 251–257, 2004, ISSN: 13865056.
@article{blobel_authorisation_2004,
  title = {Authorisation and access control for electronic health record systems},
  author = {Bernd Blobel},
  url = {http://linkinghub.elsevier.com/retrieve/pii/S1386505603001916},
  doi = {10.1016/j.ijmedinf.2003.11.018},
  issn = {13865056},
  year = {2004},
  date = {2004-03-01},
  urldate = {2015-03-05},
  journal = {International Journal of Medical Informatics},
  volume = {73},
  number = {3},
  pages = {251--257},
  abstract = {Enabling the shared care paradigm, centralised or even decentralised electronic health record (EHR) systems increasingly become core applications in hospital information systems and health networks. For realising multipurpose use and reuse as well as inter-operability at knowledge level, EHR have to meet special architectural requirements. The component-oriented and model-based architecture should meet international standards. Especially in extended health networks realising inter-organisational communication and co-operation, authorisation cannot be organised at user level anymore. Therefore, models, methods and tools must be established to allow formal and structured policy definition, policy agreements, role definition, authorisation and access control. Based on the author’s international engagement in EHR architecture and security standards referring to the revision of CEN ENV 13606, the GEHR/open EHR approach, HL7 and CORBA, models for health-specific and EHR-related roles, for authorisation management and access control have been developed. The basic concept is the separation of structural roles defining organisational entity-to-entity relationships and enabling specific acts on the one hand, and functional roles bound to specific activities and realising rights and duties on the other hand. Aggregation of organisational, functional, informational and technological components follows specific rules. Using UML and XML, the principles as well as some examples for analysis, design, implementation and maintenance of policy and authorisation management as well as access control have been practically implemented. © 2004 Elsevier Ireland Ltd. All rights reserved.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Leslie Burnett; Kris Barlow-Stewart; Anné L Proos; Harry Aizenberg: The "GeneTrustee": a universal identification system that ensures privacy and confidentiality for human genetic databases. Journal of law and medicine, 10 (4), pp. 506–513, 2003, ISSN: 1320-159X.
@article{burnett_genetrustee:_2003,
  title = {The "GeneTrustee": a universal identification system that ensures privacy and confidentiality for human genetic databases},
  author = {Leslie Burnett and Kris Barlow-Stewart and Anné L Proos and Harry Aizenberg},
  issn = {1320-159X},
  year = {2003},
  date = {2003-05-01},
  journal = {Journal of law and medicine},
  volume = {10},
  number = {4},
  pages = {506--513},
  abstract = {This article describes a generic model for access to samples and information in human genetic databases. The model utilises a "GeneTrustee", a third-party intermediary independent of the subjects and of the investigators or database custodians. The GeneTrustee model has been implemented successfully in various community genetics screening programs and has facilitated research access to genetic databases while protecting the privacy and confidentiality of research subjects. The GeneTrustee model could also be applied to various types of non-conventional genetic databases, including neonatal screening Guthrie card collections, and to forensic DNA samples.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Sheri A. Alpert: Protecting Medical Privacy: Challenges in the Age of Genetic Information. Journal of Social Issues, 59 (2), pp. 301–322, 2003, ISSN: 1540-4560.
@article{alpert_protecting_2003,
  title = {Protecting Medical Privacy: Challenges in the Age of Genetic Information},
  author = {Sheri A. Alpert},
  url = {http://onlinelibrary.wiley.com/doi/10.1111/1540-4560.00066/abstract},
  doi = {10.1111/1540-4560.00066},
  issn = {1540-4560},
  year = {2003},
  date = {2003-01-01},
  urldate = {2013-04-12},
  journal = {Journal of Social Issues},
  volume = {59},
  number = {2},
  pages = {301--322},
  abstract = {This article examines the privacy issues that arise from the convergence of two trends: the computerization of medical records, and the increasingly detailed level of personal genetic information that will potentially be placed within the electronic medical record. The article discusses the privacy and public policy implications for medical care, group identity, and familial relationships arising from the transition toward electronic medical records which will increasingly contain highly detailed genetic information. As such, the article focuses on the confidentiality of the electronic medical record, the increasing prevalence and sophistication of genetic testing and analysis, and the implications of electronic genetic information.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Zhen Lin; Michael Hewett; Russ B Altman: Using binning to maintain confidentiality of medical data. Proceedings / AMIA ... Annual Symposium. AMIA Symposium, pp. 454–458, 2002, ISSN: 1531-605X.
@article{lin_using_2002,
  title = {Using binning to maintain confidentiality of medical data},
  author = {Zhen Lin and Michael Hewett and Russ B Altman},
  issn = {1531-605X},
  year = {2002},
  date = {2002-01-01},
  journal = {Proceedings / AMIA ... Annual Symposium. AMIA Symposium},
  pages = {454--458},
  abstract = {Biomedical informatics in general and pharmacogenomics in particular require a research platform that simultaneously enables discovery while protecting research subjects' privacy and information confidentiality. The development of inexpensive DNA sequencing and analysis technologies promises unprecedented database access to very specific information about individuals. To allow analysis of this data without compromising the research subjects' privacy, we must develop methods for removing identifying information from medical and genomic data. In this paper, we build upon the idea that binned database records are more difficult to trace back to individuals. We represent symbolic and numeric data hierarchically, and bin them by generalizing the records. We measure the information loss due to binning using an information theoretic measure called mutual information. The results show that we can bin the data to different levels of precision and use the bin size to control the tradeoff between privacy and data resolution.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Bruce Walsh: Estimating the Time to the Most Recent Common Ancestor for the Y chromosome or Mitochondrial DNA for a Pair of Individuals. Genetics, 158 (2), pp. 897–912, 2001, ISSN: 0016-6731, 1943-2631.
@article{walsh_estimating_2001,
  title = {Estimating the Time to the Most Recent Common Ancestor for the Y chromosome or Mitochondrial DNA for a Pair of Individuals},
  author = {Bruce Walsh},
  url = {http://www.genetics.org/content/158/2/897},
  issn = {0016-6731, 1943-2631},
  year = {2001},
  date = {2001-06-01},
  urldate = {2013-07-04},
  journal = {Genetics},
  volume = {158},
  number = {2},
  pages = {897--912},
  abstract = {Bayesian posterior distributions are obtained for the time to the most recent common ancestor (MRCA) for a nonrecombining segment of DNA (such as the nonpseudoautosomal arm of the Y chromosome or the mitochondrial genome) for two individuals given that they match at k out of n scored markers. We argue that the distribution of the time t to the MRCA is the most natural measure of relatedness for such nonrecombining regions. Both an infinite-alleles (no recurring mutants) and stepwise mutation model are examined, and these agree well when n is moderate to large and k/n is close to one. As expected, the infinite alleles model underestimates t relative to the stepwise model. Using a modest number (20) of microsatellite markers is sufficient to obtain reasonably precise estimates of t for individuals separated by 200 or less generations. Hence, the multilocus haplotypes of two individuals can be used not only to date very deep ancestry but also rather recent ancestry as well. Finally, our results have forensic implications in that a complete match at all markers between a suspect and a sample excludes only a modest subset of the population unless a very large number of markers (\textgreater{}500 microsatellites) are used.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Philip Bohannon; Markus Jakobsson; Sukamol Srikwan: Cryptographic Approaches to Privacy in Forensic DNA Databases. Imai, Hideki; Zheng, Yuliang (Ed.): Public Key Cryptography, (1751), pp. 373–390, Springer Berlin Heidelberg, 2000, ISBN: 978-3-540-66967-8, 978-3-540-46588-1.
@incollection{bohannon_cryptographic_2000,
  title = {Cryptographic Approaches to Privacy in Forensic DNA Databases},
  author = {Philip Bohannon and Markus Jakobsson and Sukamol Srikwan},
  editor = {Imai, Hideki and Zheng, Yuliang},
  url = {http://link.springer.com/chapter/10.1007/978-3-540-46588-1_25},
  isbn = {978-3-540-66967-8, 978-3-540-46588-1},
  year = {2000},
  date = {2000-01-01},
  urldate = {2015-04-28},
  booktitle = {Public Key Cryptography},
  number = {1751},
  pages = {373--390},
  publisher = {Springer Berlin Heidelberg},
  series = {Lecture Notes in Computer Science},
  abstract = {Advances in DNA sequencing technology and human genetics are leading to the availability of inexpensive genetic tests, notably tests for individual predisposition to certain diseases. While such information is often valuable, its availability has raised serious concerns over the privacy of genetic information. These concerns are further heightened when genetic information is gathered into databases. We study access control for one class of such databases, forensic DNA databases, used to match unknown perpetrators against groups of potential suspects – usually convicted criminals. Our key observation is that for legitimate forensic queries, the sensitive information belonging to the target individual is already available to the querying agent in the form of a blood or tissue sample from a crime scene. We show how forensic DNA databases may be implemented so that only legitimate queries are feasible. In particular, a person with unlimited access to the database will be unable to extract information about any individual unless the necessary genetic information for that individual is already known. We develop a general solution framework, and show how to implement databases which handle certain cases of missing or incorrect DNA tests. Our framework and techniques are applicable to the general problem of encrypting information based on partially known or partially correct keys, and its security is based on standard cryptographic assumptions.},
  keywords = {},
  pubstate = {published},
  tppubtype = {incollection}
}
Zhu, Hui; Liu, Xiaoxia; Lu, Rongxing; Li, Hui: Efficient and Privacy-Preserving Online Medical Pre-Diagnosis Framework Using Nonlinear SVM. 0000.
@article{zhuefficient,
  title = {Efficient and Privacy-Preserving Online Medical Pre-Diagnosis Framework Using Nonlinear SVM},
  author = {Zhu, Hui and Liu, Xiaoxia and Lu, Rongxing and Li, Hui},
  publisher = {IEEE},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Johannes Buchmann; Matthias Geihs; Kay Hamacher; Stefan Katzenbeisser; Sebastian Stammler (all Technische Universität Darmstadt): Long-Term Integrity Protection of Genomic Data. GenoPri 2017, 0000.
@article{buchmann2017,
  title = {Long-Term Integrity Protection of Genomic Data},
  author = {Johannes Buchmann and Matthias Geihs and Kay Hamacher and Stefan Katzenbeisser and Sebastian Stammler},
  journal = {GenoPri 2017},
  abstract = {Genomic data is crucial in the understanding of many diseases and guidance of medical treatments. Pharmacogenomics and cancer genomics are just two areas in precision medicine of rapidly growing utilization. At the same time, sequencing costs are plummeting below \$1,000, meaning that a rapid growth in full-genome data storage requirements is foreseeable. While the privacy of genomic data processing is receiving growing attention, long-term integrity protection of this highly sensitive and at the same time amply space-consuming data much less so. We present a scenario inspired by the future in pharmacogenomics, in which random parts of a patient’s genome are periodically accessed by authorized parties such as doctors and clinicians. A protection scheme is described that preserves integrity of the genomic data in that scenario over a time horizon of 100 years. During such a long time period, cryptographic schemes will potentially break and therefore our scheme allows to update the integrity protection. Furthermore, integrity of parts of the genomic data can be verified while preserving the privacy of the rest of the data. Finally, a performance evaluation shows that privacy-preserving long-term integrity protection of genomic data is feasible.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}