Research

Our research focuses on extracting useful information from human genome sequences using a combination of extremely large genomic data sets and sophisticated informatic approaches. We currently focus on four broad research areas:

Genome data aggregation
We lead an international consortium that has assembled gnomAD, the world’s largest single collection of human genome sequencing data, currently spanning 141,456 individuals. We’re mining this massive dataset for insights into human evolution, gene function, and disease risk.

Genomic approaches to rare disease diagnosis
We develop and apply genomic approaches (especially exome, whole-genome and transcriptome sequencing) and informatic methods to discover disease-causing mutations in patients with severe diseases, with a particular focus on neuromuscular diseases such as muscular dystrophy. Our tool xBrowse is an intuitive browser-based system for analyzing exome and genome data from rare disease families.

Loss-of-function variants
We have a longstanding research interest in the discovery of loss-of-function (LoF) variants – DNA changes that result in the complete obliteration of the normal function of a protein-coding gene – and their use in understanding gene function and human disease risk. We have developed improved annotation tools for identifying LoF variants, and participate in several large-scale international efforts to characterize the impact of these variants on human phenotype and disease risk.

Transcriptome sequencing and analysis
We use transcriptome sequencing (RNA-seq) approaches to better characterize the impact of DNA sequence variants on human gene function. As part of the GTEx Project we are exploring the impact of rare genetic variants on gene expression and splicing in the general population, as well as the biological mechanisms behind X chromosome inactivation in females. We also leverage RNA-seq from patient tissue to improve discovery of disease-causing mutations in rare diseases.

MacArthur Lab

Extracting useful information from large genomic datasets.
