FA875024CB130
Definitive Contract
Overview
Government Description
EXPLAINABLE LARGE LANGUAGE MODELS GENERATING RESPONSES ORGANIZED AND VERIFIABLE WITH EVIDENCE (ELLMGROVE)
Alternate Description
ELLMGROVE
Awardee
Assured Information Security
Awarding / Funding Agency
Place of Performance
Rome, NY 13441 United States
Pricing
Fixed Price
Set Aside
Small Business Set Aside - Total (SBA)
Extent Competed
Full And Open Competition After Exclusion Of Sources
Est. Average FTE
5
Related Opportunity
None
Assured Information Security was awarded Definitive Contract FA875024CB130 (FA8750-24-C-B130) for Explainable Large Language Models Generating Responses Organized And Verifiable With Evidence (ELLMGROVE), worth up to $1,799,899, by Air Force Research Laboratory in September 2024. The contract has a duration of 2 years and was awarded through SBIR Topic Trustworthy Generative Artificial Intelligence (GenAI) to Structure Data and Deliver Accurate Insights of Command, Control, Communication and Computer (C4) Systems, with a Small Business Total set aside, NAICS 541715, and PSC AC12, via direct negotiation acquisition procedures with 33 bids received.
SBIR Details
Research Type
Small Business Innovation Research Program (SBIR) Phase II
Title
Explainable Large Language Models Generating Responses Organized and Verifiable with Evidence (ELLMGROVE)
Related Solicitation
Abstract
Assured Information Security, Inc., in collaboration with Georgia Tech Research Institute and Infinity Labs, LLC, proposes Explainable Large Language Models Generating Responses Organized and Verifiable with Evidence (ELLMGROVE), a research effort to develop explainable Large Language Models (LLMs) as Reinforcement Learning (RL) agents to support Command, Control, Communication, and Computer (C4) systems.
Since the release of GPT-2 [1], instruction-tuned [2] LLMs have revolutionized the field of Natural Language Processing (NLP), achieving state-of-the-art (SOTA) performance on a variety of tasks. These capabilities make LLMs a promising technology for augmenting C4 systems by analyzing new intelligence as it arrives and providing actionable recommendations based on emerging situational awareness. Operators, warfighters, and/or commanders who use C4 systems may have only a short window in which to act on new intelligence, because it becomes stale, useless, or otherwise not actionable in a matter of hours. In such settings, any Artificial Intelligence and Machine Learning (AI/ML) system intended to expedite the exploitation of incoming intelligence must have strong zero-shot capabilities, as the short window in which to act precludes online learning or fine-tuning.
Despite the remarkable abilities of LLMs, deploying them in DoD-relevant settings requires overcoming critical hurdles. Like other AI/ML models, LLMs are black boxes, and there is no clear understanding of how they perform linguistic tasks or understand natural language. Moreover, just as clever prompting can elicit reasoning in LLMs [5], it can also manipulate a model into violating the guardrails intended to prevent it from producing offensive, sensitive, or otherwise inappropriate responses [12]. This vulnerability makes it difficult to use LLMs in settings involving classified or otherwise sensitive data, as the model may, for example, divulge sensitive information to an operator.
LLMs have demonstrated remarkable problem-solving skills through their application to RL. For example, LLMs can reason over long-term objectives, a perennial challenge in RL, while a traditional RL agent focuses on short-term skill building through its interactions with the environment [14]. Alternatively, an LLM can itself become an RL agent if prompted with a description of the environment, the objectives it needs to solve, and guidance on how to reason about its observations [15]. In the context of C4 systems, an LLM-based RL agent can receive observations in the form of incoming intelligence and generate recommendations about how best to respond to them. Importantly, if the agent can accurately cite the specific information that justifies its responses, including incoming intelligence, military doctrine, policies, etc., the operator can have greater confidence and trust in the LLM and its responses.
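The prompting and citation ideas in the abstract can be illustrated with a minimal sketch. This is not the awardee's method: every name below (`Evidence`, `build_prompt`, `check_citations`, `stub_llm`), the bracketed `[ID]` citation format, and the sample source text are hypothetical, and the model call is replaced by a canned string. The sketch only shows the three prompt ingredients the abstract lists (environment, objective, reasoning guidance) and one simple way to check that each cited source actually exists in the supplied evidence.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    doc_id: str  # hypothetical identifier, e.g. "INTEL-1"
    text: str    # content of the source document


def build_prompt(environment: str, objective: str, guidance: str,
                 evidence: list[Evidence]) -> str:
    """Combine environment, objective, and guidance with the evidence pool."""
    sources = "\n".join(f"[{e.doc_id}] {e.text}" for e in evidence)
    return (
        f"Environment: {environment}\n"
        f"Objective: {objective}\n"
        f"Guidance: {guidance}\n"
        f"Sources:\n{sources}\n"
        "Recommend an action and cite supporting sources as [ID]."
    )


# Assumed citation syntax: an identifier in square brackets.
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def check_citations(response: str,
                    evidence: list[Evidence]) -> tuple[set[str], set[str]]:
    """Return (valid, unknown) citation IDs found in the response."""
    known = {e.doc_id for e in evidence}
    cited = set(CITATION.findall(response))
    return cited & known, cited - known


def stub_llm(prompt: str) -> str:
    """Placeholder for a real model call; always returns a canned answer."""
    return ("Reroute traffic through the backup relay [INTEL-1], "
            "consistent with [DOCTRINE-7]. See also [INTEL-9].")


if __name__ == "__main__":
    pool = [
        Evidence("INTEL-1", "Primary relay reported offline."),
        Evidence("DOCTRINE-7", "Maintain redundant communication paths."),
    ]
    prompt = build_prompt(
        environment="Simulated communications network",
        objective="Restore connectivity",
        guidance="Justify each recommendation with a cited source.",
        evidence=pool,
    )
    valid, unknown = check_citations(stub_llm(prompt), pool)
    print("verifiable citations:", sorted(valid))
    print("unsupported citations:", sorted(unknown))
```

In this sketch, a citation to an ID that is absent from the evidence pool is flagged rather than trusted, which is one simple way an operator-facing tool might surface unsupported claims. Checking that the cited text actually supports the recommendation is a harder problem that the sketch does not attempt.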
Research Objective
The goal of Phase II is to continue the R&D efforts initiated in Phase I. Funding is based on the results achieved in Phase I and on the scientific and technical merit and commercial potential of the project proposed in Phase II.
Topic Code
AF241-D013
Agency Tracking Number
F2D-11676
Solicitation Number
24.1
Contact
Simon Khan
Status
(Open)
Last Modified 3/27/25
Period of Performance
Start Date
9/27/24
Current End Date
9/27/26
Potential End Date
9/27/26
Obligations
Total Obligated
$1.8M
Current Award
$1.8M
Potential Award
$1.8M
Award Hierarchy
Definitive Contract
FA875024CB130
Competition
Number of Bidders
33
Solicitation Procedures
Negotiated Proposal/Quote
Evaluated Preference
None
Performance Based Acquisition
Yes
Commercial Item Acquisition
Commercial Item Procedures Not Used
Simplified Procedures for Commercial Items
No
Other Categorizations
Subcontracting Plan
Plan Not Required
Cost Accounting Standards
Exempt
Business Size Determination
Small Business
Defense Program
None
DoD Claimant Code
None
IT Commercial Item Category
Not Applicable
Awardee UEI
PPEKTM9CTAJ3
Awardee CAGE
3CHJ4
Agency Detail
Awarding Office
FA8750 FA8750 AFRL RIK
Funding Office
F4FBEQ
Created By
shannon.sullivan@us.af.mil
Last Modified By
shannon.sullivan@us.af.mil
Approved By
shannon.sullivan@us.af.mil
Legislative
Legislative Mandates
None Applicable
Performance District
NY-22
Senators
Kirsten Gillibrand
Charles Schumer
Representative
Brandon Williams