U01HG012064
Cooperative Agreement
Overview
Grant Description
Predictive Modeling of the Functional and Phenotypic Impacts of Genetic Variants - Project Summary
Genome-wide association studies (GWAS) have associated tens of thousands of common variants with human diseases and traits. The rapid expansion of whole-genome sequencing (WGS) studies and biobanks offers great potential to understand the physiologic and pathophysiologic associations of both common and rare variants. The IGVF Consortium aims to systematically study the functional and phenotypic effects of genomic variation. However, it is not feasible to experimentally characterize the vast number of candidate variants of interest. Computational models that can accurately predict the context-specific effects of variants are essential in designing targeted research.
We propose an approach anchored on a framework of high-confidence regulatory elements (REs), from which we will develop methods to learn RE-gene links, perform rare variant association tests, and fine-map causal common and rare variants. We aim to make all our results, methods, and tools available to the community through a public portal and the NHGRI and NHLBI Data Commons.
Our proposal has four aims:
(1) Develop a core framework of REs from open chromatin regions on which to anchor our models. We will improve on past approaches by producing higher-resolution predictions of functional base-pairs, producing novel RE subclassifications using functional characterization datasets from IGVF and other sources, and harnessing single-cell datasets to delineate lineage- and stimulus-specific elements.
(2) Use this framework to predict the roles of variants in molecular phenotypes, specifically gene expression and cellular response to stimuli. We will build statistical and machine-learning methods to predict context-specific links between REs and their target genes, using three-dimensional conformation data produced by the IGVF Consortium and external sources. We will apply this method across many cell types and perform feature selection to build a catalog of high-confidence RE-gene links and regulatory networks.
(3) Develop statistical methods to perform cell type-specific rare variant association tests (CellSTAAR) in WGS studies, and a latent variable model to prioritize candidate functional variants for traits and diseases, using results from aims 1 and 2. We will apply these methods to analyze various metabolic, immune-mediated, and psychiatric disorders in the multi-ethnic WGS data of the NHLBI Trans-Omic Precision Medicine Program (TOPMed) and the NHGRI Genome Sequencing Program (GSP) to identify candidate causal disease-associated variants.
(4) Make all the results publicly available by substantially expanding the FAVOR portal to include whole genome variant functional annotations of all three billion genomic positions as well as cell type-specific annotations. We will implement both FAVOR and CellSTAAR in the Data Commons AnVIL (NHGRI) and BioData Catalyst (NHLBI) so researchers may use them for analysis of new datasets in a scalable cloud computing environment. We will work closely with other centers and the Data Analysis Coordinating Center (DACC) of the IGVF on joint analyses and building the IGVF variant catalog.
Funding Goals
NHGRI supports the development of resources and technologies that will accelerate genome research and its application to human health and genomic medicine. A critical part of the NHGRI mission continues to be the study of the ethical, legal and social implications (ELSI) of genome research. NHGRI also supports the training and career development of investigators and the dissemination of genome information to the public and to health professionals. The Small Business Innovation Research (SBIR) program is used to increase private sector commercialization of innovations derived from federal research and development, to increase small business participation in federal research and development, and to foster and encourage participation of socially and economically disadvantaged small business concerns and women-owned small business concerns in technological innovation. The Small Business Technology Transfer (STTR) program is used to foster scientific and technological innovation through cooperative research and development carried out between small business concerns and research institutions, to foster technology transfer between small business concerns and research institutions, to increase private sector commercialization of innovations derived from federal research and development, and to foster and encourage participation of socially and economically disadvantaged small business concerns and women-owned small business concerns in technological innovation.
Grant Program (CFDA)
Awarding / Funding Agency
Place of Performance
Worcester, Massachusetts 01605-2324, United States
Geographic Scope
Single Zip Code
Related Opportunity
Analysis Notes
Amendment: Since the initial award, total obligations have increased 821%, from $367,902 to $3,386,874.
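The 821% figure in the amendment note can be checked from the two dollar amounts it quotes. The snippet below is only an illustrative sketch: the function name `percent_increase` and the variable names are hypothetical, and the only inputs are the two obligation amounts stated in the note.

```python
def percent_increase(initial: float, current: float) -> float:
    """Return the percentage increase from `initial` to `current`."""
    return (current - initial) / initial * 100


# Amounts taken from the amendment note; names are illustrative only.
initial_obligation = 367_902
total_obligation = 3_386_874

increase = percent_increase(initial_obligation, total_obligation)
print(f"{increase:.0f}%")  # rounds to the percentage quoted in the note
```

The increase is measured relative to the initial obligation, so a roughly ninefold total corresponds to an increase of a little over 800%.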
University of Massachusetts Medical School was awarded Predictive Modeling of the Functional and Phenotypic Impacts of Genetic Variants, Cooperative Agreement U01HG012064, worth $3,386,874, from the National Human Genome Research Institute in August 2021, with work to be completed primarily in Worcester, Massachusetts, United States. The grant has a duration of 4 years 9 months and was awarded through assistance program 93.172 Human Genome Research. The Cooperative Agreement was awarded through grant opportunity Developing Predictive Models of the Impact of Genomic Variation on Function (U01 Clinical Trial Not Allowed).
Status
(Ongoing)
Last Modified 9/24/25
Period of Performance
8/20/21
Start Date
5/31/26
End Date
Funding Split
$3.4M
Federal Obligation
$0.0
Non-Federal Obligation
$3.4M
Total Obligated
Additional Detail
Award ID FAIN
U01HG012064
SAI Number
U01HG012064-1813829744
Award ID URI
SAI UNAVAILABLE
Awardee Classifications
Public/State Controlled Institution Of Higher Education
Awarding Office
75N400 NIH National Human Genome Research Institute
Funding Office
75N400 NIH National Human Genome Research Institute
Awardee UEI
MQE2JHHJW9Q8
Awardee CAGE
6R004
Performance District
MA-02
Senators
Edward Markey
Elizabeth Warren
Budget Funding
| Federal Account | Budget Subfunction | Object Class | Total | Percentage |
|---|---|---|---|---|
| National Human Genome Research Institute, National Institutes of Health, Health and Human Services (075-0891) | Health research and training | Grants, subsidies, and contributions (41.0) | $1,504,776 | 100% |