U01CA253481
Cooperative Agreement
Overview
Grant Description
Integrative Genomic and Epigenomic Analysis of Cancer Using Long Read Sequencing - Project Summary
The last twenty years have seen extensive growth in the sequencing of cancer genomes, leading to a dramatically increased understanding of the role of genetic and epigenetic mutations in cancer. This has largely been enabled by developments in high-throughput "second-generation" sequencing technology and analysis that characterize cancer genomes using short reads.
Recently, a new generation of high-throughput long-read sequencing instruments, primarily from Pacific Biosciences and Oxford Nanopore, has become available and is poised to displace short-read sequencing for many applications. We and others have used these technologies to discover tens of thousands of variants per cancer genome that are not detectable using short reads, including structural variants and differentially methylated regions in known oncogenes and cancer risk genes.
These technologies carry the potential to address many open questions in cancer biology. However, the analysis of long-read sequencing data is computationally demanding and needs specialized algorithms that are either too inefficient to use at scale or do not yet exist.
In this proposal, we will address several gaps in the application of long-read technology for basic research and clinical use in cancer genomics. First, we will develop improved methods for finding structural variants and complex repeat expansions from long reads, both of which are major diagnostic and prognostic indicators of disease, yet are not accurately identified using existing methods.
Leveraging the improved phasing capabilities of long reads, this work will include the detection of mosaic variants, revealing tumor heterogeneity and variants in precancerous tissues.
Next, we will apply machine learning and systems-level advances to accelerate and improve the comparison of variants across large patient cohorts. Critically, this will compensate for the error-prone nature of single-molecule long-read sequencing, making comparisons of tumor-normal samples or pedigrees of related patients more accurate so that recurrent driving mutations can be accurately identified.
Finally, we will develop integrative methods for the joint analysis of genome, transcriptome, and epigenetic profiling of cancer genomes. These advances will improve the identification of fusion genes and allow for entirely new forms of epigenetic analysis, such as the allele-specific analysis of methylation across transposable elements and other repetitive elements.
Synthesizing the many thousands of novel variants we will detect using our methods, we will then develop algorithms that will identify and evaluate recurrent genetic or epigenetic variations as putative driving mutations. All methods will be released open-source and will empower us, our ITCR collaborators, and the cancer genomics community at large to study genetic and epigenetic variants with near-perfect accuracy and thereby unlock many new associations with treatment and disease.
Awardee
Funding Goals
TO PROVIDE FUNDAMENTAL INFORMATION ON THE CAUSE AND NATURE OF CANCER IN PEOPLE, WITH THE EXPECTATION THAT THIS WILL RESULT IN BETTER METHODS OF PREVENTION, DETECTION AND DIAGNOSIS, AND TREATMENT OF NEOPLASTIC DISEASES. CANCER BIOLOGY RESEARCH INCLUDES THE FOLLOWING RESEARCH PROGRAMS: CANCER CELL BIOLOGY, CANCER IMMUNOLOGY, HEMATOLOGY AND ETIOLOGY, DNA AND CHROMOSOMAL ABERRATIONS, TUMOR BIOLOGY AND METASTASIS, AND STRUCTURAL BIOLOGY AND MOLECULAR APPLICATIONS.
Grant Program (CFDA)
Awarding / Funding Agency
Place of Performance
Baltimore, Maryland 21218, United States
Geographic Scope
Single Zip Code
Related Opportunity
Analysis Notes
Amendment: Since the initial award, the End Date has been extended from 04/30/24 to 04/30/25 and the total obligations have increased 185%, from $383,463 to $1,094,142.
The Johns Hopkins University was awarded Cooperative Agreement U01CA253481, "Integrative genomic and epigenomic analysis of cancer using long read sequencing", worth $1,094,142, from the National Cancer Institute in May 2021, with work to be completed primarily in Baltimore, Maryland, United States. The grant has a duration of 4 years and was awarded through assistance program 93.396 Cancer Biology Research. The Cooperative Agreement was awarded through grant opportunity Early-Stage Development of Informatics Technologies for Cancer Research and Management (U01 Clinical Trial Optional).
Status
(Complete)
Last Modified 9/5/25
Period of Performance
Start Date
5/1/21
End Date
4/30/25
Funding Split
Federal Obligation
$1.1M
Non-Federal Obligation
$0.0
Total Obligated
$1.1M
Additional Detail
Award ID FAIN
U01CA253481
SAI Number
U01CA253481-153854768
Award ID URI
SAI UNAVAILABLE
Awardee Classifications
Private Institution Of Higher Education
Awarding Office
75NC00 NIH National Cancer Institute
Funding Office
75NC00 NIH National Cancer Institute
Awardee UEI
FTMTDMBR29C7
Awardee CAGE
5L406
Performance District
MD-07
Senators
Benjamin Cardin
Chris Van Hollen
Budget Funding
| Federal Account | Budget Subfunction | Object Class | Total | Percentage |
|---|---|---|---|---|
| National Cancer Institute, National Institutes of Health, Health and Human Services (075-0849) | Health research and training | Grants, subsidies, and contributions (41.0) | $710,680 | 100% |
Modified: 9/5/25