R01AI169543

Project Grant

Overview

Grant Description
Rapid Response for Pandemics: Single Cell Sequencing and Deep Learning to Predict Antibody Sequences against an Emerging Antigen - Abstract

One of the "holy grails" in immunology is to be able to directly predict tight-binding variable chain antibody sequences in silico against foreign or non-self "antigenic" proteins. Immunoglobulin chain rearrangement can potentially encode approximately 10^16 different variants of antibody heavy and light chain sequences. However, only a small fraction of the sequence space is generally accessed for evolving antibodies against foreign proteins.

The computational challenge is to go from a model of the structure of an antigen to a predicted set of antibody chain sequences that bind tightly to that antigen. If solved, it might be possible to move in less than 24 hours from the first cryo-electron-microscopic structure of a novel viral protein to a set of potent antibody-like molecular candidates ready for testing.

Towards solving this problem, this project aims to develop a deep learning architecture that will take as input thermodynamic, quantum mechanical (density functional), and local structure-based network topographical features of the antigens and their cognate antibodies, and will output their respective binding affinity constants. We will design a generative adversarial network (GAN), which we think is uniquely suited for regression-based machine learning approaches for the immune system, to discover associations between the epitope and the variable chain features.

This approach requires a large data stream of antigen and cognate antibody sequences, which until recently was difficult to obtain. A recently described single B-cell receptor (BCR) specific tagging method coupled with single cell deep sequencing ("Linking B Cell Receptor to Antigen Specificity through Sequencing" or LIBRA-SEQ) can rapidly isolate and sequence the BCR variable chain coding regions that can bind with high selectivity to antigenic epitopes.

The project has three tasks. In Task 1, LIBRA-SEQ will be used to rapidly identify candidate immunoglobulin coding sequences generated in response to specific linear and nonlinear epitopes (against controls) injected into a mouse model. The epitopes will be chosen through computational/molecular modeling, with priority given to (but not restricted to) SARS-CoV-2 spike protein epitopes, and the resulting sequences will form large training sets. In Task 2, these training sets, along with other data sets already available in public databases, will be used to derive the structural features described above, which will in turn be used to train the GAN. In Task 3, the predicted epitope-antibody interactions will be validated by direct experiments with synthetic antibody and phage-display systems.

Thus, the proposed strategy applies foundational principles from evolutionary biology, genomics, structural chemistry, and computer science to the solution of a general biological engineering problem. Results from this project are expected to lay the foundations for a rigorously tested and fully automated machine-learning system that could rapidly generate synthetic antibody candidates from the structure of a novel virus protein, enhancing the ability to respond rapidly to a future pandemic. If the project is successful, the ability to develop targeted antibody therapies against non-infectious or chronic diseases, and to produce antibody-based industrial enzymes, would also be dramatically enhanced.

The Team: The team-leads of this multi-institutional research project comprise a computer scientist, a protein crystallographer, an immunologist, and a molecular biologist.
Funding Goals
NOT APPLICABLE
Place of Performance
California, United States
Geographic Scope
State-Wide
Analysis Notes
COVID-19: $1,219,945 (40%) of this Project Grant was funded by COVID-19 emergency acts, including the CARES Act.
Amendment: Since the initial award, the End Date has been extended from 08/31/24 to 08/31/25, and the total obligations have increased 66% from $1,851,627 to $3,071,572.
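The two percentages in the notes above can be reproduced from the dollar figures given on this page. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative assumptions, and only the dollar amounts come from the record itself.

```python
# Dollar figures copied from the Analysis Notes; names are illustrative only.
covid_funding = 1_219_945       # amount funded by COVID-19 emergency acts
initial_obligation = 1_851_627  # total obligations at initial award
total_obligation = 3_071_572    # total obligations after amendment


def percent_of(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def percent_increase(old: float, new: float) -> float:
    """Return the percentage increase from old to new."""
    return 100.0 * (new - old) / old


covid_share = percent_of(covid_funding, total_obligation)
increase = percent_increase(initial_obligation, total_obligation)

print(f"COVID-19 share of total obligations: {covid_share:.1f}%")
print(f"Increase in total obligations: {increase:.1f}%")

# Arithmetic observation only: the dollar increase equals the COVID-19 amount.
print(total_obligation - initial_obligation == covid_funding)
```

Rounded to whole numbers, these values agree with the 40% and 66% stated in the notes.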
Keck Graduate Institute Of Applied Life Sciences was awarded Rapid Response: Single Cell Sequencing & Deep Learning Antibody Prediction Project Grant R01AI169543, worth $3,071,572, from the National Institute of Allergy and Infectious Diseases in September 2021, with work to be completed primarily in California, United States. The grant has a duration of 4 years and was awarded through assistance program 93.310 Trans-NIH Research Support. The Project Grant was awarded through grant opportunity NIH Director's Emergency Transformative Research Awards (R01 Clinical Trial Optional).

Status
(Complete)

Last Modified 4/21/25

Period of Performance
Start Date
9/16/21
End Date
8/31/25
100% Complete

Funding Split
Federal Obligation
$3.1M
Non-Federal Obligation
$0.0
Total Obligated
$3.1M
100.0% Federal Funding
0.0% Non-Federal Funding

Additional Detail

Award ID FAIN
R01AI169543
SAI Number
R01AI169543-3787174297
Award ID URI
SAI UNAVAILABLE
Awardee Classifications
Private Institution Of Higher Education
Awarding Office
75NM00 NIH National Institute of Allergy and Infectious Diseases
Funding Office
75NA00 NIH OFFICE OF THE DIRECTOR
Awardee UEI
GJLMWYFPMQZ7
Awardee CAGE
1YCP1
Performance District
CA-90
Senators
Dianne Feinstein
Alejandro Padilla

Budget Funding

Federal Account
Office of the Director, National Institutes of Health, Health and Human Services (075-0846)
Budget Subfunction
Health research and training
Object Class
Grants, subsidies, and contributions (41.0)
Total
$1,219,945
Percentage
100%
Modified: 4/21/25