Our research focuses on extracting useful information from human genome sequences using a combination of extremely large genomic data sets and sophisticated informatic approaches. We currently focus on four broad research areas:

Exome aggregation
We have assembled and reprocessed the world’s largest collection of human exome data, providing unprecedented resolution of the patterns of genetic variation in human protein-coding genes. We are currently mining a combined data set from over 80,000 individuals for insight into human evolution, gene function, and disease gene identification.

Genomic approaches to rare disease diagnosis
We develop and apply genomic approaches (especially exome, whole-genome, and transcriptome sequencing) and informatic methods to discover disease-causing mutations in patients with severe diseases, with a particular focus on neuromuscular diseases such as muscular dystrophy. Our tool xBrowse is an intuitive browser-based system for analyzing exome and genome data from rare disease families.

Loss-of-function variants
We have a longstanding research interest in the discovery of loss-of-function (LoF) variants – DNA changes that result in the complete loss of the normal function of a protein-coding gene – and their use in understanding gene function and human disease risk. We have developed improved annotation tools for identifying LoF variants, and we participate in several large-scale international efforts to characterize the impact of these variants on human phenotype and disease risk.

Transcriptome sequencing and analysis
We use transcriptome sequencing (RNA-seq) approaches to better characterize the impact of DNA sequence variants on human gene function. As part of the GTEx Project, we are exploring the impact of rare genetic variants on gene expression and splicing in the general population, as well as the biological mechanisms behind X-chromosome inactivation in females. We also use RNA-seq from patient tissue to improve the discovery of disease-causing mutations in rare diseases.