R01HL159077
Project Grant
Overview
Grant Description
Bayesian Machine Learning for Causal Inference with Incomplete Longitudinal Covariates and Censored Survival Outcomes - Project Summary
Population cohort studies funded by the National Institutes of Health, including the Atherosclerosis Risk in Communities (ARIC) study and the Multi-Ethnic Study of Atherosclerosis (MESA), are widely used in cardiovascular research and have provided fundamental knowledge for cardiovascular disease (CVD) prevention strategies and public health policies.
Pooling data across multiple cohorts provides a unique opportunity for in-depth investigation of emerging CVD research questions that could not previously be studied, such as the optimal blood pressure thresholds for initiating antihypertensive treatment in young adults. Although the pooled cohort data are fertile ground for innovative research, existing statistical methods cannot effectively address the methodological issues they raise. There are three main analytic challenges.
First, many discrete or continuous longitudinal variables have missing values with various missing data patterns. Existing methods either are susceptible to misspecification bias or do not provide coherent estimates of imputation uncertainty, and none can handle data that are missing not at random.
Second, current causal inference methods require either aligned measurement time points or parametric assumptions about the forms of causal pathways, neither of which can be satisfied in complex longitudinal health data.
Third, violations of the "sequential ignorability" assumption embedded in causal inference methodology can introduce bias, and sensitivity analysis methods for time-varying confounding with censored survival outcomes are underdeveloped.
To overcome these challenges and advance statistical and CVD research, we propose a suite of generalizable statistical methods that use machine learning. We will develop a scalable Bayesian nonparametric (BNP) framework to impute continuous or discrete longitudinal covariates that are missing at random while providing coherent uncertainty intervals, and we will address missing-not-at-random mechanisms via sensitivity analysis. We will apply the developed method to missing data issues in several longitudinal CVD risk factors, such as blood pressure and cholesterol levels (Specific Aim 1).
We will develop a robust and computationally efficient BNP causal inference method (Specific Aim 2) and a new continuous-time marginal structural survival model from a Bayesian perspective (Specific Aim 3) to study and validate the survival effects of time-varying antihypertensive treatments for young adults and the frail elderly.
We will also develop a flexible and interpretable survival sensitivity analysis method to assess how sensitive the causal effect estimates are to varying degrees of sequential unmeasured confounding (Specific Aim 4).
Finally, we will create usable R software packages for all proposed methods and develop tutorial papers and short courses to bridge theoretical and practical knowledge and promote use of our methods (Specific Aim 5).
Funding Goals
THE NATIONAL HEART, LUNG, AND BLOOD INSTITUTE (NHLBI) PROVIDES GLOBAL LEADERSHIP FOR A RESEARCH, TRAINING, AND EDUCATION PROGRAM TO PROMOTE THE PREVENTION AND TREATMENT OF HEART, LUNG, AND BLOOD DISEASES AND ENHANCE THE HEALTH OF ALL INDIVIDUALS SO THAT THEY CAN LIVE LONGER AND MORE FULFILLING LIVES. TO FOSTER HEART AND VASCULAR RESEARCH IN THE BASIC, TRANSLATIONAL, CLINICAL AND POPULATION SCIENCES, AND TO FOSTER TRAINING TO BUILD TALENTED YOUNG INVESTIGATORS IN THESE AREAS, FUNDED THROUGH COMPETITIVE RESEARCH TRAINING GRANTS. SMALL BUSINESS INNOVATION RESEARCH (SBIR) PROGRAM: TO STIMULATE TECHNOLOGICAL INNOVATION; USE SMALL BUSINESS TO MEET FEDERAL RESEARCH AND DEVELOPMENT NEEDS; FOSTER AND ENCOURAGE PARTICIPATION IN INNOVATION AND ENTREPRENEURSHIP BY SOCIALLY AND ECONOMICALLY DISADVANTAGED PERSONS; AND INCREASE PRIVATE-SECTOR COMMERCIALIZATION OF INNOVATIONS DERIVED FROM FEDERAL RESEARCH AND DEVELOPMENT FUNDING. SMALL BUSINESS TECHNOLOGY TRANSFER (STTR) PROGRAM: TO STIMULATE TECHNOLOGICAL INNOVATION; FOSTER TECHNOLOGY TRANSFER THROUGH COOPERATIVE R&D BETWEEN SMALL BUSINESSES AND RESEARCH INSTITUTIONS, AND INCREASE PRIVATE SECTOR COMMERCIALIZATION OF INNOVATIONS DERIVED FROM FEDERAL R&D.
Grant Program (CFDA)
Awarding / Funding Agency
Place of Performance
Newark, New Jersey 071073001 United States
Geographic Scope
Single Zip Code
Related Opportunity
Analysis Notes
Amendment: Since the initial award, total obligations have increased 354%, from $722,305 to $3,275,727.
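The percentage in the amendment note can be checked from the two dollar figures it quotes. The sketch below is only an arithmetic check; the function name `percent_increase` is a hypothetical helper introduced for illustration and is not part of the grant record.

```python
def percent_increase(initial: float, current: float) -> float:
    """Return the percentage increase from `initial` to `current`."""
    if initial <= 0:
        raise ValueError("initial must be positive")
    return (current - initial) / initial * 100.0


# Dollar figures quoted in the amendment note.
initial_obligation = 722_305
total_obligation = 3_275_727

increase = percent_increase(initial_obligation, total_obligation)
# Rounded to the nearest whole percent, this matches the 354% stated in the note.
print(f"Increase: {round(increase)}%")
```

Note that this is an increase relative to the initial award, so the current total is roughly 4.5 times the initial obligation rather than 3.5 times.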
Rutgers The State University Of New Jersey was awarded Bayesian Machine Learning Causal Inference in Cardiovascular Research, Project Grant R01HL159077, worth $3,275,727 from the National Heart, Lung, and Blood Institute in May 2022, with work to be completed primarily in Newark, New Jersey, United States. The grant has a duration of 5 years and was awarded through assistance program 93.837 Cardiovascular Diseases Research. The Project Grant was awarded through grant opportunity NIH Research Project Grant (Parent R01 Clinical Trial Not Allowed).
Status
(Ongoing)
Last Modified 5/5/26
Period of Performance
Start Date
5/15/22
End Date
4/30/27
Funding Split
Federal Obligation
$3.3M
Non-Federal Obligation
$0.0
Total Obligated
$3.3M
Additional Detail
Award ID FAIN
R01HL159077
SAI Number
R01HL159077-1474390505
Award ID URI
SAI UNAVAILABLE
Awardee Classifications
Public/State Controlled Institution Of Higher Education
Awarding Office
75NH00 NIH National Heart, Lung, and Blood Institute
Funding Office
75NH00 NIH National Heart, Lung, and Blood Institute
Awardee UEI
YVVTQD8CJC79
Awardee CAGE
6VL59
Performance District
NJ-10
Senators
Robert Menendez
Cory Booker
Budget Funding
| Federal Account | Budget Subfunction | Object Class | Total | Percentage |
|---|---|---|---|---|
| National Heart, Lung, and Blood Institute, National Institutes of Health, Health and Human Services (075-0872) | Health research and training | Grants, subsidies, and contributions (41.0) | $1,389,101 | 100% |