Jobs

Job Openings


About the MacArthur Lab

We are a tight-knit research group jointly based at Massachusetts General Hospital and the Broad Institute of Harvard and MIT. We leverage the largest genomic data sets in the world and cutting-edge analysis methods to make sense of human genetic variation. We’re committed to open data and open-source code, as well as experimenting with new methods of communication. Working with us is a chance to learn from experts in computational biology, large-scale genomics, variant interpretation, software development and clinical genomics, as well as to make a difference to the lives of hundreds of families affected by rare diseases.


Associate Computational Biologist II

We are seeking a creative and self-motivated candidate to play a key role in developing and writing fast, automated, open-source computational pipelines to produce high-quality public data releases for forthcoming — and exponentially growing — datasets in gnomAD. The role will involve close collaboration with senior staff scientists and postdoctoral trainees in the group to develop novel approaches for quality control and analysis of our highly heterogeneous datasets at scale. The candidate will also manage a diverse portfolio of human genetic datasets generated within the lab pertaining to rare disease research. As this role involves collaboration with a wide variety of staff across disciplines, including computational scientists, academic trainees, software engineers, biologists, and clinical geneticists, we are specifically looking for a candidate who works well in teams.

As part of the methods development team, you will have the opportunity to make substantial contributions to high-impact projects with direct implications for clinical practice, as well as to participate in the vibrant research environment at the Broad Institute, with its close links to MIT, Harvard, and the Harvard-affiliated hospitals across Boston. You will have access to data sets of exceptional scale and to colleagues with deep expertise in genetics, computational biology, software development, and machine learning.

Key to our success is growing a strong team with a diverse membership who foster a culture of continual learning, and who support the growth and success of one another. Towards this end, we are committed to seeking applications from women and from underrepresented groups. We know that many excellent candidates choose not to apply despite their capabilities; please allow us to enthusiastically counter this tendency. If you are a computational scientist who is eager to grow professionally, contribute to our team culture, and participate in high-impact, open science, then we encourage you to apply.

Responsibilities for the role include:

    • Working closely with a senior-level computational scientist to develop, design, and execute elegant and efficient computational methods and pipelines necessary to create gnomAD releases (e.g., sample and variant quality-control workflows)
    • Producing data releases to meet regular deadlines
    • Writing pipeline code in Python and Hail with an emphasis on correctness, concision, readability, re-usability, robustness to scale, speed, and automation/ease of use
    • Identifying and resolving bugs in pipeline code
    • Implementing, benchmarking, and optimizing existing computational algorithms to meet production objectives
    • Writing new algorithms to handle data in cases where no satisfactory methods exist
    • Regularly contributing code to our public GitHub repository and participating in team-based code review
    • Managing and tracking a high volume of genetic datasets that are generated within the laboratory
    • Contributing code, figures, and text to team presentations and publications
    • Presenting work at regular intervals at lab and project team meetings, as well as institute, national, and/or international conferences as appropriate

Characteristics and Qualifications:

The role will require an independent and highly motivated candidate with the ambition to develop and contribute to a significant and sophisticated body of code that is used on a regular basis to produce large, public data releases with a highly active and invested user community.

You will have an interest in developing domain expertise in computational methods for analyzing next-generation sequencing data, as well as an interest in the technical aspects of deploying these methods at scale.

We are looking for someone who:

    • Is able to write clean, efficient, robust, and usable code, with demonstrated proficiency in one of the following: Linux, Perl, Python, Java, C++, Matlab, or R, with a strong preference towards Linux, Python, and R
    • Has a BS or MS degree and 0-2 years of experience in computer science, engineering, physics, mathematics, statistics, biology, or related fields
    • Has demonstrated experience in quantitative (statistical, mathematical, computational) research with large data sets; skill and experience with statistical analysis and/or computational biology is strongly preferred — with special consideration for individuals with prior experience using Hail
    • Has fluency with human genetics and next-generation sequencing data; ideally will have prior experience with the quality control of such datasets
    • Exhibits strong initiative and the ability to take ownership of assigned tasks and projects
    • Cares passionately about the quality of their work and demonstrates zealous attention to detail
    • Listens, communicates, and collaborates well with team members, clinicians, software developers, and research scientists
    • Demonstrates excellent written and oral presentation skills
    • Manages time well and is able to respond to shifting priorities in a fast-paced and rapidly changing environment

Please apply via the Broad Institute careers site.

Computational Biologist I

We are seeking an ambitious, creative, self-motivated, PhD-level candidate to play a leadership role in designing and developing fast, automated, open-source computational pipelines to produce high-quality public data releases for forthcoming — and exponentially growing — datasets in gnomAD. The role will involve close collaboration with postdoctoral scientists in the group to develop novel approaches for quality control and analysis of our highly heterogeneous datasets at scale, as well as the supervision of an associate (junior-level) computational scientist in the group to work on this key project. The candidate will also interact with Hail developers at the Broad to play a role in the feature design of the field’s most cutting-edge toolkit (https://hail.is) for massively parallel, high-throughput computation of genetic data. As this role involves collaboration with a wide variety of staff across disciplines, including computational scientists, academic trainees, software engineers, biologists, and clinical geneticists, we are specifically looking for a candidate who works well in teams.

As part of the methods development team, you will have the opportunity to make substantial contributions to high-impact projects with direct implications for clinical practice, as well as to participate in the vibrant research environment at the Broad Institute, with its close links to MIT, Harvard, and the Harvard-affiliated hospitals across Boston. You will have access to data sets of exceptional scale and to colleagues with deep expertise in genetics, computational biology, software development, and machine learning.

Key to our success is growing a strong team with a diverse membership who foster a culture of continual learning, and who support the growth and success of one another. Towards this end, we are committed to seeking applications from women and from underrepresented groups. We know that many excellent candidates choose not to apply despite their capabilities; please allow us to enthusiastically counter this tendency. If you are a computational scientist who is eager to grow professionally, contribute to our team culture, and participate in high-impact, open science, then we encourage you to apply.

Individual responsibilities for the role include:

    • Collaborating with postdoctoral scientists to develop, design, and execute elegant and efficient computational methods and pipelines necessary to create gnomAD releases (e.g., sample and variant quality-control workflows)
    • Managing production cycles to meet hard data release deadlines
    • Writing pipeline code in Python and Hail with an emphasis on correctness, concision, readability, re-usability, robustness to scale, speed, and automation/ease of use
    • Setting and reinforcing standards for style across multiple contributors to the gnomAD codebase; participating in peer review of code through pull requests
    • Identifying and resolving bugs in pipeline code
    • Monitoring and assessing current, relevant scientific literature related to the group’s analysis aims, in order to understand emerging practices and to ensure the group continues to employ optimal methods
    • Initiating consultations with scientists, software engineers, and mathematicians within and external to the Broad with relevant specific expertise in the course of evaluating existing methods and developing novel ones
    • Designing clear, appealing, and accessible analysis metrics and visuals to assess call set quality and to benchmark new releases against historic releases; and implementing an automatic workflow for producing such reports
    • Collaborating with postdoctoral scientists and browser software engineers to organize and structure gnomAD data and code into sensible and accessible formats, including call sets, sample and variant annotations, and pipeline code
    • Communicating work to the wider scientific community through group meetings within the lab and at the Broad; through scientific publication; open-source code publication; and presentations/posters at institute, national, and/or international conferences as appropriate

Management responsibilities for the role include:

    • Supervising and mentoring an associate (junior-level) computational staff scientist assigned to work on gnomAD releases, including weekly check-ins, quarterly performance reviews, and discussions on career development
    • Overseeing project management for gnomAD releases, managing progress towards completion by setting concrete objectives and tasks, professional standards, and expectations for the associate staff scientist; by helping them prioritize tasks; by troubleshooting technical issues, suggesting resources (people and tools), and managing relationships with collaborators
    • Handling general supervisory/HR administrative tasks for associate staff, including approving vacations and expense reports, writing annual performance reviews, and making recommendations for promotions and salary increases

Characteristics and Qualifications:

The role will require an independent and highly motivated candidate with the ambition to maintain and develop a significant and sophisticated body of code that is used on a regular basis to produce large, public data releases with a highly active and invested user community.

You will have domain expertise in computational methods for analyzing next-generation sequencing data, as well as an interest in the technical aspects of deploying these methods at scale.

We are looking for someone who:

    • Is able to write clean, efficient, robust, and usable code, with demonstrated proficiency in one of the following: Unix/Linux, Perl, Python, Java, C++, Matlab, or R, with a strong preference for Unix/Linux, Python, and R
    • Has a Ph.D. in mathematics, computer science, engineering, physics, statistics, biology, or another related field; or equivalent professional experience
    • Has demonstrated experience in quantitative (statistical, mathematical, computational) research with large data sets; skill and experience with statistical analysis and/or computational biology is strongly preferred — with special consideration for individuals with prior experience using Hail
    • Has fluency with human genetics and next-generation sequencing data; ideally will have prior experience with the quality control of such datasets
    • Exhibits strong initiative and the ability to take ownership of complex projects and the management and development of a team
    • Cares passionately about the quality of their work and demonstrates zealous attention to detail
    • Is familiar with Git and modern team-based software development practices, including peer code review through pull requests
    • Listens, communicates, and collaborates well with team members, clinicians, software developers, and research scientists
    • Demonstrates excellent written and oral presentation skills
    • Manages time well and is able to respond to shifting priorities in a fast-paced and rapidly changing environment

Please apply via the Broad Institute careers site.

Software Engineer: Genome Aggregation Database

We are seeking a creative, self-motivated software engineer with a passion for visual information design. The candidate will work closely with a team of scientists to create web applications for exploring complex genetic datasets. More specifically, we will be carrying out an analysis of the largest collection of human genomes and exomes ever assembled, with the goal of understanding how DNA sequence variation relates to human physiology, traits, and susceptibility to disease. Our catalogue currently contains about half a billion genetic variants found in two hundred thousand people of diverse ancestry, and we expect that this resource will grow to surpass one million sequenced humans over the next three years.

In this role, you will join a world-class research institute and work closely with a tightly knit team of scientists, software engineers, computational biologists, medical doctors, and geneticists. Your code will be 100% open source, and you will play a critical part in making our results accessible to the larger scientific community. To date, our resources have been used by tens of thousands of researchers and physicians around the world who strive to understand the molecular basis for disease.

Key to our success is growing a strong team with a diverse membership who foster a culture of continual learning, and who support the growth and success of one another. Towards this end, we are committed to seeking applications from women and from underrepresented groups. We know that many excellent candidates choose not to apply despite their capabilities; please allow us to enthusiastically counter this tendency. If you are a software engineer who is eager to grow professionally, contribute to our team culture, and participate in high-impact, open science, then we encourage you to apply.

Responsibilities for this role include:

    • Collaborating closely with researchers to develop performant web applications for effectively displaying the results of our analysis.
    • Participating in the design process.
    • Engineering a React component library that will be reused in multiple genomics projects across the Broad Institute.
    • Creating elegant APIs and efficient NoSQL database queries.
    • Manipulating large datasets using distributed cloud technologies.
    • Deploying web applications using container orchestration.

Characteristics and Qualifications:

    • Ability to write clean, modern JavaScript and Python.
    • Experience developing web applications using a popular JavaScript framework.
    • Comfortable working with some of the following technologies: React, Redux, CSS, D3, GraphQL, Elasticsearch, Node.js, Python, Hail, Docker, Kubernetes, Google Cloud Platform.
    • Excited to learn new domains of knowledge.
    • Familiar with Git and modern team-based software development practices, including peer code review through pull requests.
    • Eager to maintain an open-source project and respond promptly to user issues or inquiries.
    • Manages time well and is able to respond to shifting priorities in a fast-paced and rapidly changing environment.
    • Actively engages with team members to develop and refine software.
    • If possible, the candidate should include in their cover letter a link to an online portfolio, open-source project, or web application they have contributed to.

Please apply via the Broad Institute careers site.

Software Engineer: Rare Disease Genomics

Join a team that’s building open-source web-based decision support tools to dramatically accelerate the pace of diagnosis for families affected by rare genetic conditions. Our platform seqr (github.com/macarthur-lab/seqr) is used by an international consortium of collaborating clinicians, researchers, and industry partners, and significantly improves their ability to search through large genetic datasets and make discoveries and diagnoses. seqr is core to our efforts both in the Rare Genomes Project and the Broad Center for Mendelian Genomics, and has already enabled us to provide genetic diagnoses to more than 1,000 rare disease families. We are now looking for a full-stack software engineer who will help with the next phase of this project.

Your primary responsibilities will include designing and implementing new features. This may include developing intuitive visualizations and user interfaces (using JavaScript and React.js), integrating new kinds of datasets, and designing new APIs and database schemas (in Python and Django, Elasticsearch and PostgreSQL). We welcome applications from diverse backgrounds and experience levels. Knowledge of genetics or biology is preferred but not required. Above all, the ideal candidate will have strong engineering skills, a desire to learn new domains, and the enthusiasm and ability to contribute to multiple aspects of seqr development.

Requirements:

    • BS or MS degree in Computer Science or other scientific discipline.
    • Experience delivering clear, well-designed software.
    • Interest in working with a wide variety of technologies and on diverse problems.
    • Excellent communication skills and ability to work with users.
    • Experience with web-based application development in Python and JavaScript is required.
    • Experience with Django, React.js, Redux, SQL, Elasticsearch, Google Cloud Platform, and Kubernetes is preferred.
    • Familiarity with genomics and DNA sequencing data analysis is a plus.

Please apply via the Broad Institute careers site.


Post-doctoral Fellows

We are looking for postdoc candidates with backgrounds in computational genomics or statistical genetics, ideally with direct experience in analyzing human sequencing data. Projects include identifying human knockouts, leveraging large human genetic data sets for drug target discovery, and improving the diagnosis of rare disease patients. Most importantly, we’re looking for people who are passionate about the translation of genomics into clinical practice, and have the right personality to work in a fast-paced and highly collaborative environment.

To apply, email Daniel with your CV.