2404323

Cooperative Agreement

Overview

Grant Description
Category II: Democratizing the Accelerator Ecosystem for Science and Discovery

Accelerated computing has become an essential capability for advancing science and engineering. The growth of artificial intelligence (AI) and machine learning and the performance benefits afforded by graphics processing units (GPUs) are driving researchers across nearly every domain to adopt GPUs.

The National Science Foundation has made substantial investments in providing the community with GPU resources, expertise, training, and other programs in support of this transition. The innovative COSMOS system at the San Diego Supercomputer Center (SDSC) features AMD's MI300A Accelerated Processing Unit (APU), which contains both a CPU and a GPU accelerator in a single chip together with high-bandwidth, unified memory.

The unified memory facilitates an incremental programming approach, lowering the barrier to the adoption of GPUs by many communities, easing the process of porting and optimizing applications. COSMOS enables researchers to exploit this innovative and powerful accelerator technology in an open software environment to expand the range of applications that can effectively use accelerators.

The benefits of accelerating the applications described in the proposal will aid discoveries in materials science, genomics, astrophysics, large language models, artificial intelligence, and many other domains. COSMOS nodes contain AMD MI300A APUs, each with high-bandwidth, unified memory, integrated into 4-socket nodes with all-to-all connectivity using AMD's high-speed interconnect, which provides a socket-to-socket global memory interface. The system architecture is based on HPE's EX2500, which provides a dense, energy-efficient, liquid-cooled system.

A high-performance, flash-based storage system provides the high IOPS and bandwidth needed for the anticipated mixed-application workload. The system can be cross-mounted to other SDSC systems to facilitate data sharing, software development, and benchmarking. Capacity storage is provided via a Ceph filesystem.

The project is structured as a three-year testbed phase, followed by a two-year allocations phase. During the testbed phase COSMOS project staff will collaborate with research teams covering several exemplar science and engineering applications including those from astronomy, neuroscience, molecular biology, structural engineering, machine learning and others.

Included are applications that have yet to be ported and those that can already run on GPUs but would benefit from the flexible and open architecture of the APU and its software ecosystem. Collaborations specifically target community codes, science gateways, and enabling middleware, where success in porting a single application brings along many users and institutions.

Integration with the Open Science Grid aims to further extend the benefits of the APU to thousands of users in the high-throughput computing community. Lessons learned and best practices developed from the research collaborations will be shared with the wider user community through project workshops, user training events, and participation in the AMD User Forum.

The allocations phase will incorporate lessons learned from the testbed phase regarding application porting to the APU, leading to software development resources, training materials, and publications that allow others to migrate their applications to realize the benefits of accelerated computing. During the allocations phase, COSMOS will be available to researchers through an NSF-approved allocation process.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Subawards are not planned for this award.
Funding Goals
The goal of this funding opportunity, "Advanced Computing Systems & Services: Adapting to the Rapid Evolution of Science and Engineering Research", is identified in the link: HTTPS://WWW.NSF.GOV/PUBLICATIONS/PUB_SUMM.JSP?ODS_KEY=NSF23518
Grant Program (CFDA)
Place of Performance
La Jolla, California 92093-0934 United States
Geographic Scope
Single Zip Code
Analysis Notes
Amendment: Since the initial award, total obligations have increased 145%, from $5,000,000 to $12,249,999.
The University of California, San Diego was awarded Cooperative Agreement 2404323, "Accelerating Science and Discovery with GPU Technology", worth $12,249,999 by the NSF Office of Integrative Activities in July 2024, with work to be completed primarily in La Jolla, California, United States. The grant has a duration of 5 years and was awarded through assistance program 47.083 Integrative Activities. The Cooperative Agreement was awarded through the grant opportunity Advanced Computing Systems & Services: Adapting to the Rapid Evolution of Science and Engineering Research.
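
The 145% figure in the amendment note can be checked from the two obligation amounts it quotes. The following is a minimal sketch of that arithmetic; the function name `percent_increase` and the variable names are illustrative assumptions, not part of the award record.

```python
from decimal import Decimal


def percent_increase(initial: Decimal, current: Decimal) -> Decimal:
    """Return the percentage increase from `initial` to `current`."""
    return (current - initial) / initial * 100


# Amounts as quoted in the amendment note.
initial_obligation = Decimal("5000000")
total_obligation = Decimal("12249999")

increase = percent_increase(initial_obligation, total_obligation)

# Rounded to a whole percent, matching the precision of the note.
print(f"{increase:.0f}%")
```

The unrounded result is just under 145, so the note's figure is a rounding to the nearest whole percent.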

Status
(Ongoing)

Last Modified 8/12/25

Period of Performance
7/1/24
Start Date
6/30/29
End Date
23.0% Complete

Funding Split
$12.2M
Federal Obligation
$0.0
Non-Federal Obligation
$12.2M
Total Obligated
100.0% Federal Funding
0.0% Non-Federal Funding

Additional Detail

Award ID FAIN
2404323
SAI Number
None
Award ID URI
SAI EXEMPT
Awardee Classifications
Public/State Controlled Institution Of Higher Education
Awarding Office
490509 OFC OF ADV CYBERINFRASTRUCTURE
Funding Office
490106 OFFICE OF INTEGRATIVE ACTIVITIES
Awardee UEI
UYTTZT6G9DT1
Awardee CAGE
50854
Performance District
CA-50
Senators
Dianne Feinstein
Alejandro Padilla