2233508

Cooperative Agreement

Overview

Grant Description
SBIR Phase II: Authoring Assistance via Contextual Semantic Labeling - The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase II project comes from extracting meaningful, useful, specific information from "dark data." Dark data are the countless documents companies produce and receive, which contain unused information - usually because they are in formats that computers do not understand.

Many of these documents do not even contain accessible text: only pictures of text. Word-processing documents and emails do have text, but no information about what the text is. Computers can easily tell that "10/05/2022" is a date, but knowing it is the date a particular agreement starts or ends (or something else) is needed to make it useful.

This project uses a range of artificial intelligence (AI) techniques that work in real time while people are writing new documents or extracting data from old documents. The AI learns quickly from examples, finds patterns across similar documents, and uses that learning to save the user from having to search for items again and again in varying contexts. This saves a lot of tedious work and reduces errors.

The extracted information helps companies understand, analyze, and make business decisions. This Small Business Innovation Research (SBIR) Phase II project identifies and extracts useful information items from long natural language documents, especially contracts and agreements. The technology identifies items much more specifically than typical extraction methods; for example, not only as person, organization, or place names, but as to what role each plays. Likewise, addresses, dates, money amounts, and other data items only become useful when you know what they're for. This is a valuable focus for advancing natural language understanding.

The team combines and extends machine learning technologies such as few-shot learning, fine-tuning, and semantic parsing to achieve these stronger, more "semantic" results. This solution allows companies to generate value from huge troves of information they already collect but cannot yet automate or leverage.
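To make the difference between a generic type ("this is a date") and a contextual role ("this is the start date") concrete, here is a minimal sketch. It is not the awardee's method: the sample sentence, the second date in it, the `ROLE_CUES` table, and the `label_dates` function are all hypothetical, and a simple keyword rule stands in for the few-shot learning, fine-tuning, and semantic parsing the description mentions.

```python
import re

# Hypothetical cue words mapped to role labels (assumption for illustration;
# a learned model would replace this hand-written table).
ROLE_CUES = {
    "commence": "start_date",
    "effective": "start_date",
    "terminate": "end_date",
    "expire": "end_date",
}

# Generic extraction: finds anything shaped like MM/DD/YYYY, with no role attached.
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def label_dates(text, window=40):
    """Return (date, role) pairs using the words shortly before each date."""
    results = []
    for match in DATE_PATTERN.finditer(text):
        context = text[max(0, match.start() - window):match.start()].lower()
        role = "unknown_date"
        best_pos = -1
        # Choose the cue that appears closest to the date within the window.
        for cue, label in ROLE_CUES.items():
            pos = context.rfind(cue)
            if pos > best_pos:
                best_pos = pos
                role = label
        results.append((match.group(), role))
    return results


# Hypothetical contract sentence, not taken from any real agreement.
sample = (
    "This Agreement shall commence on 10/05/2022 and, unless renewed, "
    "shall terminate on 10/04/2024."
)
labeled = label_dates(sample)
print(labeled)
```

Both strings match the same date pattern, but only the surrounding words tell the two roles apart, which is the gap the project description says it targets.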

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Awardee
Docugami
Funding Goals
The goal of this funding opportunity, "NSF Small Business Innovation Research Phase II (SBIR)/ Small Business Technology Transfer (STTR) Programs Phase II", is identified in the link: HTTPS://WWW.NSF.GOV/PUBLICATIONS/PUB_SUMM.JSP?ODS_KEY=NSF22552
Place of Performance
Kirkland, Washington 98034-6933 United States
Geographic Scope
Single Zip Code
Related Opportunity
22-552
Analysis Notes
Amendment: Since the initial award, the end date has been extended from 05/31/25 to 11/30/25, and total obligations have increased 20% from $998,314 to $1,194,443.
Docugami was awarded Cooperative Agreement 2233508, worth $1,194,443, by the National Science Foundation in June 2023, with work to be completed primarily in Kirkland, Washington, United States. The grant has a duration of 2 years 5 months and was awarded through assistance program 47.084 NSF Technology, Innovation, and Partnerships.
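The figures in these notes can be checked with a few lines of arithmetic. This is a small sketch using only the amounts and dates listed on this page; the variable names are illustrative.

```python
from datetime import date

# Amounts from the amendment note above.
initial_obligation = 998_314
current_obligation = 1_194_443

increase_pct = (current_obligation - initial_obligation) / initial_obligation * 100
print(f"Obligation increase: {increase_pct:.1f}%")

# Period of performance as listed on this page (6/15/23 to 11/30/25).
start = date(2023, 6, 15)
end = date(2025, 11, 30)

# Count whole calendar months between the two dates.
whole_months = (end.year - start.year) * 12 + (end.month - start.month)
if end.day < start.day:
    whole_months -= 1
years, months = divmod(whole_months, 12)
print(f"Duration: {years} years {months} months")
```

The increase works out to about 19.6%, which rounds to the 20% stated in the note, and the duration comes to 2 years 5 months.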

SBIR Details

Research Type
SBIR Phase II
Title
SBIR Phase II: Authoring Assistance via Contextual Semantic Labeling
Topic Code
AI
Solicitation Number
NSF 22-552

Status
(Ongoing)

Last Modified 4/17/25

Period of Performance
Start Date
6/15/23
End Date
11/30/25
99.0% Complete

Funding Split
Federal Obligation
$1.2M
Non-Federal Obligation
$0.0
Total Obligated
$1.2M
100.0% Federal Funding
0.0% Non-Federal Funding

Additional Detail

Award ID FAIN
2233508
SAI Number
None
Award ID URI
SAI EXEMPT
Awardee Classifications
Small Business
Awarding Office
491503 TRANSLATIONAL IMPACTS
Funding Office
491503 TRANSLATIONAL IMPACTS
Awardee UEI
RU6DPZBACK59
Awardee CAGE
8EN93
Performance District
WA-01
Senators
Maria Cantwell
Patty Murray

Budget Funding

Federal Account
Research and Related Activities, National Science Foundation (049-0100)
Budget Subfunction
General science and basic research
Object Class
Grants, subsidies, and contributions (41.0)
Total
$998,314
Percentage
100%
Modified: 4/17/25