U24HG007234
Cooperative Agreement
Overview
Grant Description
Gencode: Comprehensive Reference Genome Annotation for Human and Mouse - Project Summary
The Gencode Consortium creates foundational reference genome annotation for the human and mouse genomes. All features are identified and classified with high accuracy based on biological evidence, and then freely released for the benefit of biomedical research and genome interpretation.
Gencode seeks to create annotation that increases the understanding of genome function in both human and mouse. It prioritizes human disease genes and respects the role of mouse as the major mammalian model organism.
To effectively annotate genomes, Gencode has created a suite of tools and draws on its partners' deep expertise across four fundamental components:
1) A comprehensive gene annotation pipeline leveraging manual and computational annotation.
2) A set of computational methods to evaluate and enhance gene annotation.
3) Experimental pipelines targeted to expressed sequences less detectable in standard protocols.
4) A machine learning capacity to improve all facets of the project.
Gencode will maintain a major focus on protein-coding and non-coding loci, including their alternatively spliced isoforms and pseudogenes. It will also extend expert manual review to small non-coding RNAs (ncRNA) and the annotation of non-polyadenylated transcripts.
Additionally, Gencode will expand regulatory annotation to a defined set of gene-associated features to more accurately reflect the interconnections between regulatory regions, including those with transcribed sequences such as ncRNA, and overall transcriptional output.
Gencode will take advantage of the increasing maturity of genomics technology, including long-read transcriptome sequencing, functional genomics assays, and graph-based genome representations. This will help identify features such as genes, pseudogenes, exons, and splice sites that are incorrect, incomplete, or in genome regions simply not present in the current reference assembly.
More specifically, in the next four years, Gencode plans to:
1) Extend its human and mouse gene sets to as near completion as possible given available data and current experimental technology.
2) Leverage new, high-quality human genome assemblies and targeted transcriptomic data to expand representation so that more human haplotypes will have high-quality annotation.
3) Annotate gene-associated regulatory regions, including enhancer-promoter connections.
4) Collaborate with other resources to ensure a consistent representation of genic and regulatory features and reference transcripts for reporting clinical variation.
5) Distribute Gencode annotations and engage with community annotation efforts to ensure accuracy and consistency.
Primary Gencode data will continue to be available from the Ensembl and UCSC Genome Browsers and the Gencode website. The consortium will also develop new mechanisms for effective two-way outreach, training, and communication with the community. The long-term aim is to establish Gencode as the standard annotation set for research and clinical genomics applications.
Funding Goals
NOT APPLICABLE
Grant Program (CFDA)
Awarding / Funding Agency
Place of Performance
United Kingdom
Geographic Scope
Foreign
Related Opportunity
Analysis Notes
Amendment: Since the initial award, total obligations have increased 325% from $2,973,167 to $12,649,437.
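The 325% figure can be checked from the two obligation amounts quoted in this record. The snippet below is a minimal sketch; the variable names are illustrative and not part of the award data.

```python
# Obligation amounts as quoted in the Analysis Notes of this record.
initial_obligation = 2_973_167
current_obligation = 12_649_437

# Percentage increase relative to the initial obligation.
increase_pct = (current_obligation - initial_obligation) / initial_obligation * 100

print(f"Increase: {increase_pct:.0f}%")
```

Rounded to the nearest whole percent, this matches the stated increase.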
European Molecular Biology Laboratory was awarded Cooperative Agreement U24HG007234, "GENCODE: comprehensive reference genome annotation for human and mouse", worth $12,649,437, from the National Human Genome Research Institute in April 2013, with work to be completed primarily in the United Kingdom. The grant has a duration of 12 years 2 months and was awarded through assistance program 93.172 Human Genome Research. The Cooperative Agreement was awarded through grant opportunity Genomic Community Resources (U24 Clinical Trial Not Allowed).
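The stated duration of 12 years 2 months can be cross-checked against the period of performance listed in this record (4/1/13 to 6/30/25). The sketch below counts whole calendar months between the two dates; the helper name `whole_months_between` is hypothetical, and counting only completed months is an assumption about how the duration was derived.

```python
from datetime import date


def whole_months_between(start: date, end: date) -> int:
    """Count completed calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # If the end day falls before the start day, the last month is incomplete.
    if end.day < start.day:
        months -= 1
    return months


# Period of performance as listed in this record.
start_date = date(2013, 4, 1)
end_date = date(2025, 6, 30)

years, months = divmod(whole_months_between(start_date, end_date), 12)
print(f"{years} years {months} months")
```

Under that assumption, the result agrees with the duration given above.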
Status
(Complete)
Last Modified 9/20/24
Period of Performance
Start Date: 4/1/13
End Date: 6/30/25
Funding Split
Federal Obligation: $12.6M
Non-Federal Obligation: $0.0
Total Obligated: $12.6M
Additional Detail
Award ID FAIN
U24HG007234
SAI Number
U24HG007234-519080533
Award ID URI
SAI UNAVAILABLE
Awardee Classifications
Non-Domestic (Non-U.S.) Entity
Awarding Office
75N400 NIH NATIONAL HUMAN GENOME RESEARCH INSTITUTE
Funding Office
75N400 NIH NATIONAL HUMAN GENOME RESEARCH INSTITUTE
Awardee UEI
KZD5S45YZ4A4
Awardee CAGE
DH518
Performance District
Not Applicable
Budget Funding
Federal Account | Budget Subfunction | Object Class | Total | Percentage |
---|---|---|---|---|
National Human Genome Research Institute, National Institutes of Health, Health and Human Services (075-0891) | Health research and training | Grants, subsidies, and contributions (41.0) | $7,247,886 | 100% |
Modified: 9/20/24