HR001123C0122
Definitive Contract
Overview
Government Description
SMALL BUSINESS INNOVATION RESEARCH (SBIR) PHASE II PROGRAM, REPLICANTS: SYNTHETIC USER PERSONAS FOR CYBER SECURITY EXPERIMENTATION
Awardee
Punch Cyber
Awarding / Funding Agency
Defense Advanced Research Projects Agency
PSC
AC15
Place of Performance
Reston, VA 20190 United States
Pricing
Fixed Price
Set Aside
Small Business Set Aside - Total (SBA)
Extent Competed
Full And Open Competition After Exclusion Of Sources
Est. Average FTE
3
Punch Cyber was awarded Definitive Contract HR001123C0122 (HR0011-23-C-0122) worth up to $1,792,869 by Defense Advanced Research Projects Agency in July 2023. The contract has a duration of 3 years and was awarded through solicitation Small Business Innovation Research (SBIR) and Small Business Technology Transfer (STTR) with a Small Business Total set aside with NAICS 541715 and PSC AC15 via direct negotiation acquisition procedures with 1 bid received. As of today, the Definitive Contract has a total reported backlog of $595,143.
SBIR Details
Research Type
Small Business Innovation Research Program (SBIR) Phase II
Title
Replicants: Synthetic User Personas for Cyber Security Experimentation
Abstract
PUNCH proposes Replicant, a five-component framework that provides: state-awareness, actuation, behavior modeling, data labeling, and orchestration / management of synthetic users without leaving traces of Replicant itself. Replicant's Perception, Driver, and Persona components are jump-started by our work on the DARPA CHASE program where PUNCH: (1) created (in collaboration with FiveDirections) a working prototype version of SUP that is able to receive visual context from the host, determine its state, provide error handling, and drive synthetic benign user actions via the hypervisor so as not to create any spurious artifacts, and (2) leveraged MITRE's CALDERA framework to drive attacks that mimic APT adversaries. A primary goal and use-case of Replicant is to create labeled data sets that can be used to train, test, and evaluate cyber tools, AI/ML models, and analyst capabilities. Replicant's Data Labeling component extends and leverages PUNCH's development of: (1) Context-Aware Adaptive Data Operations (CADO), a framework to contextualize and score cyber relevancy of logs; and (2) Cyber Snorkel, a weak labeling framework to tie together attack activity reporting and cyber log data with probabilistic labels, such as MITRE ATT&CK techniques. The current SUP prototype is multi-threaded, built to scale, and is deployed and orchestrated via Ansible and VMware vSphere. In this two-year program, development on the Replicant components will focus on extending and expanding the realism of user personas; their ability to better contextualize and adapt to a broader set of states and applications; extensibility and flexibility to synthesize across different devices, operating systems, and configurations; improving the precision and recall of its data labeling; and the maturation of the management/orchestration interfaces and concept of operations documentation.
More specifically, PUNCH will: (1) leverage advancements in OCR, NLP, and image recognition CNNs for greater perception and performance; (2) create an image collection and annotation pipeline for training advanced image recognition models; (3) leverage advancements from human and agent-based behavior modeling for greater persona realism; (4) leverage content repositories and advanced language models (e.g., GPT-3) to create more realistic persona-created content; (5) extend support to a number of attacker frameworks and tools (Cobalt Strike, Sliver, Metasploit) and benign user applications, operating systems, devices, and configurations, such as support for mobile and IoT devices, foreign languages, and administrative tools; (6) leverage SIGMA rules and PUNCH's Cyber Snorkel weak labeling framework to generate labeled datasets from ground-truth artifacts in Replicant and attacker logs; and (7) perform multiple full-scale cyber exercises to demonstrate Replicant's ability to scale to 500+ hosts and generate exemplar labeled datasets for machine learning use cases.
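The weak labeling idea in the abstract (tying log records to probabilistic MITRE ATT&CK labels, then measuring precision and recall) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the record fields (`host`, `pid`, `command_line`), the labeling functions, and the attacker-log lookup are hypothetical and are not taken from Replicant, Cyber Snorkel, or CADO. The sketch also combines votes by simple vote share, which is a stand-in for whatever label model the actual framework uses.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


# Hypothetical labeling functions: each inspects one log record (a dict)
# and returns a MITRE ATT&CK technique ID or ABSTAIN.
def lf_powershell_encoded(record):
    cmd = record.get("command_line", "").lower()
    return "T1059.001" if "powershell" in cmd and "-enc" in cmd else ABSTAIN


def lf_schtasks_create(record):
    cmd = record.get("command_line", "").lower()
    return "T1053.005" if "schtasks" in cmd and "/create" in cmd else ABSTAIN


def make_attacker_log_lf(attacker_events):
    """Build a labeling function from an assumed ground-truth mapping
    of (host, pid) -> technique ID taken from attacker framework logs."""
    def lf(record):
        key = (record.get("host"), record.get("pid"))
        return attacker_events.get(key, ABSTAIN)
    return lf


def probabilistic_label(record, labeling_functions):
    """Combine votes by vote share (assumed scheme, not a trained label model).
    Returns {technique_id: probability}, or {} if every function abstains."""
    votes = Counter()
    for lf in labeling_functions:
        label = lf(record)
        if label is not ABSTAIN:
            votes[label] += 1
    total = sum(votes.values())
    return {label: count / total for label, count in votes.items()} if total else {}


def precision_recall(predicted, truth):
    """predicted and truth are sets of (record_id, technique_id) pairs."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

A record flagged by both the command-line heuristic and the attacker log receives all of its probability mass on one technique, while a record that no function recognizes is left unlabeled (treated as benign or unknown, depending on the consumer).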
Research Objective
The goal of Phase II is to continue the R&D efforts initiated in Phase I. Funding is based on the results achieved in Phase I and the scientific and technical merit and commercial potential of the project proposed in Phase II.
Topic Code
HR0011SB20234-02
Agency Tracking Number
D2D-0459
Solicitation Number
23.4
Contact
Michael Geide
Status
(Open)
Last Modified 6/7/24
Period of Performance
7/20/23
Start Date
7/20/25
Current End Date
7/20/26
Potential End Date
Obligations and Backlog
$1.2M
Total Obligated
$1.2M
Current Award
$1.8M
Potential Award
$0.0
Funded Backlog
$595.1K
Total Backlog
Award Hierarchy
Definitive Contract
HR001123C0122
Subcontracts
Competition
Number of Bidders
1
Solicitation Procedures
Negotiated Proposal/Quote
Evaluated Preference
None
Commercial Item Acquisition
Commercial Item Procedures Not Used
Simplified Procedures for Commercial Items
No
Other Categorizations
Subcontracting Plan
Plan Not Required
Cost Accounting Standards
Exempt
Business Size Determination
Small Business
Defense Program
None
DoD Claimant Code
None
IT Commercial Item Category
Not Applicable
Awardee UEI
H95DHH2NJ473
Awardee CAGE
5KGK2
Agency Detail
Awarding Office
HR0011 DEF ADVANCED RESEARCH PROJECTS AGCY
Funding Office
HR0011
Created By
james.ritch.hr0011@darpa.mil
Last Modified By
james.ritch.hr0011@darpa.mil
Approved By
james.ritch.hr0011@darpa.mil
Legislative
Legislative Mandates
None Applicable
Performance District
VA-11
Senators
Mark Warner
Timothy Kaine
Representative
Gerald Connolly