HR001123C0118
Definitive Contract
Overview
Government Description
WORKABLE HIERARCHICAL IMPERSONATION USING REINFORCEMENT LEARNING (WHIRL) PROJECT
Awardee
Cynnovative
Awarding / Funding Agency
Defense Advanced Research Projects Agency
Place of Performance
Arlington, VA 22203 United States
Pricing
Cost Plus Fixed Fee
Set Aside
Small Business Set Aside - Total (SBA)
Extent Competed
Full And Open Competition After Exclusion Of Sources
Est. Average FTE
5
Related Opportunity
Analysis Notes
Amendment: Since initial award, the Potential End Date has been extended from 08/14/26 to 12/25/26 and the Potential Award value has increased 56% from $1,791,604 to $2,786,996.
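The 56% figure in the note above can be reproduced with a short calculation. This is a minimal sketch using only the two dollar amounts quoted in the note; the function and variable names are illustrative and not part of the award record.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# Dollar amounts quoted in the amendment note.
initial_potential = 1_791_604  # potential award value at initial award
amended_potential = 2_786_996  # potential award value after amendment

increase = percent_increase(initial_potential, amended_potential)
print(f"Potential award increased by about {increase:.1f}%")
```

Rounded to a whole percent, the result matches the 56% stated in the note.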
Cynnovative was awarded Definitive Contract HR001123C0118 (HR0011-23-C-0118) for Workable Hierarchical Impersonation Using Reinforcement Learning (WHIRL) Project worth up to $2,786,996 by Defense Advanced Research Projects Agency in August 2023. The contract has a duration of 3 years 4 months and was awarded through solicitation SBIR Ph II Workable Hierarchical Impersonation using Reinforcement Learning (WHIRL) with a Small Business Total set aside with NAICS 541715 and PSC AC12 via direct negotiation acquisition procedures with 3 bids received.
As of today, the Definitive Contract has a total reported backlog of $1,259,263 and funded backlog of $662,059.
SBIR Details
Research Type
Small Business Innovation Research Program (SBIR) Phase II
Title
Workable Hierarchical Impersonation using Reinforcement Learning (WHIRL)
Abstract
Workable Hierarchical Impersonation using Reinforcement Learning (WHIRL) will generate realistic synthetic data without artifacts at scale. It does so by using hierarchical reinforcement learning and a hypervisor: long-term goals and mid-term tasks are handled off-box and passed through a shim to a hypervisor that executes them on the intended host. Team Cynnovative will use hierarchical reinforcement learning to simulate user behavior at the level at which a real user operates: keyboard and mouse activity and observing a monitor. By simulating on real hardware and executing off-box, WHIRL enables the collection of the generated synthetic data via any traditional means a WHIRL user desires without introducing any artifacts or biases. User persona research for SUP is a method of understanding the characteristics and behavior patterns of specific groups of users to gain insight into their motivations, goals, and needs and to inform the design and development of effective cybersecurity strategies. This research seeks to understand how users interact with network systems, applications, and data in order to design policies that let a user operate successfully while maintaining a robust security posture. The autonomous agent for WHIRL is rewarded for taking actions to achieve a goal, such as browsing the web, using an Excel sheet, or operating in a terminal. Feedback from the environment informs the agent how well it accomplishes the task. A fully trained agent can act as a defined synthetic user, which enables the generation and collection of robust datasets representative of realistic user behavior. Team Cynnovative's solution will operate on the raw pixels of a screen capture, which puts reinforcement learning in a real-world domain with an observation space where it has succeeded in the past, effectively eliminating the simulation-to-real-world problem. Reinforcement learning can operate on pixel data and elicit realistic, emergent behaviors with ground truth.
WHIRL operates off-box on real hardware (via hypervisor), enabling the collection of synthetic data the way the data is normally collected. Users of the WHIRL system therefore do not have to learn how to collect data via another platform; they can leverage existing knowledge and tools without filtering for artifacts or biases.
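The layering described in the abstract (long-term goals, mid-term tasks, then keyboard and mouse events forwarded through a shim) can be illustrated with a toy sketch. This is not Cynnovative's implementation, and it contains no learning: fixed lookup tables stand in where trained policies would sit, and every name here (`InputEvent`, `RecordingShim`, `GOAL_TO_TASKS`, `TASK_TO_EVENTS`, `run_goal`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InputEvent:
    """A low-level input event, the only thing the target host would receive."""
    device: str   # "keyboard" or "mouse"
    payload: str  # e.g. a key name, "move:x,y", or "click"


# Hypothetical decomposition tables. In a hierarchical RL system these
# lookups would be replaced by learned high-level and low-level policies.
GOAL_TO_TASKS = {
    "browse_web": ["open_browser", "type_url"],
}
TASK_TO_EVENTS = {
    "open_browser": [InputEvent("mouse", "move:40,700"), InputEvent("mouse", "click")],
    "type_url": [InputEvent("keyboard", ch) for ch in "example.org"]
    + [InputEvent("keyboard", "Enter")],
}


class RecordingShim:
    """Stand-in for the shim/hypervisor boundary: it only records events."""

    def __init__(self) -> None:
        self.sent: List[InputEvent] = []

    def send(self, event: InputEvent) -> None:
        self.sent.append(event)


def run_goal(goal: str, shim: RecordingShim) -> int:
    """Expand a goal into tasks, tasks into input events, and forward each event."""
    count = 0
    for task in GOAL_TO_TASKS[goal]:
        for event in TASK_TO_EVENTS[task]:
            shim.send(event)
            count += 1
    return count


shim = RecordingShim()
n_events = run_goal("browse_web", shim)
print(n_events, "events forwarded; first:", shim.sent[0], "last:", shim.sent[-1])
```

The point of the sketch is the boundary: everything above `RecordingShim.send` runs off the target host, and only plain input events cross it, which is the property the abstract credits for artifact-free data collection.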
Research Objective
The goal of Phase II is to continue the R&D efforts initiated in Phase I. Funding is based on the results achieved in Phase I and the scientific and technical merit and commercial potential of the project proposed in Phase II.
Topic Code
HR0011SB20234-02
Agency Tracking Number
D2D-0469
Solicitation Number
23.4
Contact
Matt Puglisi
Status
(Open)
Last Modified 12/26/24
Period of Performance
Start Date
8/14/23
Current End Date
12/25/26
Potential End Date
12/25/26
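The "3 years 4 months" duration quoted in the award summary follows from these dates. The sketch below assumes the two-digit years mean 2023 and 2026 (consistent with the August 2023 award) and counts whole months, dropping leftover days; the helper name is illustrative.

```python
from datetime import date


def years_months_between(start: date, end: date) -> tuple[int, int]:
    """Whole years and whole months from start to end (leftover days dropped)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return divmod(months, 12)


start_date = date(2023, 8, 14)   # Start Date
end_date = date(2026, 12, 25)    # Current and Potential End Date

years, months = years_months_between(start_date, end_date)
print(f"{years} years {months} months")
```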
Obligations and Backlog
Total Obligated
$1.5M
Current Award
$2.2M
Potential Award
$2.8M
Funded Backlog
$662.1K
Total Backlog
$1.3M
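The rounded figures in this section can be cross-checked against the exact amounts in the award summary. The sketch below assumes the common definitions that total backlog is the potential award minus total obligated and that funded backlog is the current award minus total obligated. This page does not state those definitions, so treat the derived values as a consistency check rather than reported data.

```python
# Exact amounts quoted in the award summary.
potential_award = 2_786_996
total_backlog = 1_259_263
funded_backlog = 662_059

# Assumed definitions (not stated on this page):
#   total backlog  = potential award - total obligated
#   funded backlog = current award   - total obligated
total_obligated = potential_award - total_backlog
current_award = total_obligated + funded_backlog


def in_millions(amount: int) -> str:
    """Format a dollar amount the way this section rounds it, e.g. $1.5M."""
    return f"${amount / 1e6:.1f}M"


print("Total Obligated:", in_millions(total_obligated))
print("Current Award:  ", in_millions(current_award))
print("Potential Award:", in_millions(potential_award))
```

Under those assumptions the derived values round to the $1.5M, $2.2M, and $2.8M shown in this section.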
Award Hierarchy
Definitive Contract
HR001123C0118
Competition
Number of Bidders
3
Solicitation Procedures
Negotiated Proposal/Quote
Evaluated Preference
None
Commercial Item Acquisition
Commercial Item Procedures Not Used
Simplified Procedures for Commercial Items
No
Other Categorizations
Subcontracting Plan
Plan Not Required
Cost Accounting Standards
Exempt
Business Size Determination
Small Business
Defense Program
None
DoD Claimant Code
None
IT Commercial Item Category
Not Applicable
Awardee UEI
L6NJEPKMK7U5
Awardee CAGE
7WMG9
Agency Detail
Awarding Office
HR0011 DEF ADVANCED RESEARCH PROJECTS AGCY
Funding Office
HR0011
Created By
james.ritch.hr0011@darpa.mil
Last Modified By
james.ritch.hr0011@darpa.mil
Approved By
james.ritch.hr0011@darpa.mil
Legislative
Legislative Mandates
None Applicable
Performance District
VA-08
Senators
Mark Warner
Timothy Kaine
Representative
Donald Beyer