Executive summary: the NIH is seeking comments on a new proposed policy on genomic data sharing. While there is much to like about the new policy, we are very concerned about the proposed requirement for a click-through agreement on all aggregate genomic resources (which would include heavily used databases such as ExAC and gnomAD). Our draft response to the Request for Comments is below. If you agree with our concern, please consider replying to the Request for Comments yourself; the template text at the end of this post may be a useful starting point.
Today we are celebrating the official publication of the Exome Aggregation Consortium (ExAC) paper in Nature – marking the end of a phase in this project that has involved most of the members of my lab (and many, many others beyond) for a large chunk of the last few years. This official publication is an opportune time to reflect on how ExAC came to be, and the impact it’s had on us and the wider community.
First, some background
Exome sequencing is a very cost-effective approach that allows us to look with high resolution at just the 1-2% of the human genome that codes for protein – these are the parts we understand best, and also the parts where the vast majority of severe disease-causing mutations are found. Because exome sequencing is so powerful, it has been applied to tens of thousands of patients with rare, severe diseases such as muscular dystrophy and epilepsy. However, a key challenge when sequencing patients is that everyone carries tens of thousands of genetic changes, so we need a database of “normal” variation that tells us which of those changes are seen in healthy people, and how common they are.
We have a new paper out today in Science Translational Medicine that describes our application of the massive Exome Aggregation Consortium database to understanding the variation in one specific gene: the PRNP gene, which encodes the prion protein.
This project was a special one for a number of reasons. Firstly, there’s an incredibly strong personal motivation behind this work, which you can read much more about in a blog post by lead author Eric Minikel. Secondly, it’s a clear demonstration of the way in which we can use large-scale reference databases to interpret genetic variation, including flagging some variants as non-causal or having mild effects. Thirdly, as discussed in the accompanying perspective by Robert Green and colleagues, this work is already having clinical impact by changing the diagnosis for people with families affected by prion disease. And finally, the discovery of “knockout” variants in PRNP in healthy individuals is tantalizing evidence that inhibiting this gene in mutation carriers is likely to be a safe therapeutic approach.
The paper is of course open access, so you can read the details yourself. Huge congratulations to Eric for pulling this paper together!
New DNA sequencing technologies are rapidly transforming the diagnosis of rare genetic diseases, but they also carry a risk: by allowing us to see all of the hundreds of “interesting-looking” variants in a patient’s genome, they make it potentially easy for researchers to spin a causal narrative around genetic changes that have nothing to do with disease status. Such false positive reports can have serious consequences: incorrect diagnoses, unnecessary or ineffective treatment, and reproductive decisions (such as embryo termination) based on spurious test results. To minimize such outcomes, the field needs to agree on clear statistical guidelines for deciding whether or not a variant is truly causally linked with disease.
In a paper in Nature this week we report the consensus statement from a workshop sponsored by the National Human Genome Research Institute, on establishing guidelines for assessing the evidence for variant causality.