Posted: Aug. 22, 2025, 1:58 p.m. EDT
This sources sought notice is extended to allow additional time for potential sources to respond to the Government's market research request.
Description:
In accordance with FAR 10.002, the Government is conducting market research to determine if commercial products, commercial services, or nondevelopmental items are available to meet the Government's needs or could be modified to meet the Government's needs. In addition to this notice, the Government anticipates holding interactive, on-line communication among industry, acquisition personnel, and customers and/or conducting interchange meetings or holding presolicitation conferences to involve potential offerors early in the acquisition process. In addition to the submission request below, the Government requests that vendors indicate their interest in these activities for planning purposes.
This is a Sources Sought notice. This is NOT a solicitation for proposals, proposal abstracts, or quotations.
The Government would like to obtain information regarding: (1) the availability and capability of qualified sources and their size classification relative to the North American Industry Classification System (NAICS) code for the proposed acquisition; (2) the availability and capability of qualified small business sources; and (3) whether they are small businesses; HUBZone small businesses; service-disabled veteran-owned small businesses; 8(a) small businesses; veteran-owned small businesses; women-owned small businesses; or small disadvantaged businesses.
Your responses to the information requested will assist the Government in determining the appropriate acquisition method, including whether a set-aside is possible.
This notice is issued to help determine the availability of qualified companies technically capable of meeting the Government requirement and to determine the method of acquisition. It is not to be construed as a commitment by the Government to issue a solicitation or ultimately award a contract. Responses will not be considered as proposals or quotes. No award will be made as a result of this notice. The Government will NOT be responsible for any costs incurred by the respondents to this notice. This notice is strictly for research and information purposes.
Background: The mission of the National Center for Advancing Translational Sciences (NCATS) is to transform the translational science process in order to get more treatments to more patients more rapidly. Critical to this mission is the ability to manage data, generate insights, and foster collaboration across National Institutes of Health (NIH) Institutes and Centers and with external partners such as research hospitals, academic institutions, other government agencies, and industry stakeholders. NIH, NCATS, and NCATS collaborators are among the world's leading researchers, and they have access to large, cutting-edge data sources. An increasingly complex challenge is to translate these disparate, ever-evolving data sets into actionable insights that accelerate the pace of science and clinical development. Tools are needed to enable teams to ask complex, multi-faceted questions and to collaborate more seamlessly and securely across various teams.
NCATS has established the NCATS Secure Scientific Platform Environment (the "Environment"), a specialized cloud-based data aggregation and analytics enclave that can integrate, manage, secure, and analyze any kind of scientific data, and provide secure, controlled access to internal and external collaborators.
NCATS requires a secure cloud platform-as-a-service (PaaS) that can support the National Clinical Cohort Collaborative (N3C) Data Enclave. Within the Environment, multiple NIH institutes and centers (ICs), Federal agencies, and Federal task forces integrate, manage, secure, and analyze all types of scientific data using dedicated platforms, and, equally importantly, make that data available in specific and controlled collaborations with each other and with external collaborators.
The Environment is a mission-critical data management and analysis environment for multiple data management efforts by several NIH ICs and Federal agencies under the leadership of NCATS. The Environment currently runs on the Foundry platform from Palantir Technologies, Inc., the incumbent contractor.
This platform-as-a-service (PaaS) has supported NCATS, the National Cancer Institute (NCI), the President's Emergency Plan for AIDS Relief (PEPFAR), and the National Clinical Cohort Collaborative (N3C). The Environment has integrated hundreds of live intramural and third-party data sources in support of dozens of ongoing, critical scientific projects that rely on continuous access, data, and analyses within the Platform. The Environment is now the standard means for accessing and collaboratively analyzing NCATS screening data for dozens of investigators at both NCATS and NCI, and for accessing and analyzing RNASeq and several kinds of proteomics data at NCI. The Platform has also supported clinical applications supporting NCATS (the Clinical and Translational Science Awards (CTSA) and Rare Diseases Clinical Research Network (RDCRN)), NCI, and PEPFAR.
Purpose and Objectives: A primary objective of this contract is to provide support for the N3C Data Enclave.
The N3C Data Enclave is a secure platform through which harmonized clinical data provided by contributing members is stored. The data itself can only be accessed through a secure cloud portal hosted by NCATS and cannot be downloaded or removed.
Project requirements: The Contractor shall provide the secure platform-as-a-service, software licenses, professional services, and cloud hosting to support the N3C Data Enclave as follows:
Professional Services:
- Host the N3C Data Enclave in AWS GovCloud.
- The current N3C vendor has achieved FedRAMP High Authorization for its Platform as a Service (PaaS) cloud product. The N3C Data Enclave does not require the High Impact Level, but the platform does require, at minimum, a FedRAMP Moderate Authorized PaaS offering.
- Configure the Environment's data ingestion infrastructure to support N3C's requirement to ingest, join, clean, and harmonize clinical data from all participating clinical sites. Clinical data from sites will be ingested, harmonized, and structured to significantly increase the scale of the research data asset.
- Integrate additional medical and clinical ontologies and other datasets. This will enable N3C researchers to build a more comprehensive knowledge base and link it to their scientific research to enable novel tagging and discovery workflows and provide the potential to explore new hypotheses. The Contractor will also configure additional live data connections to build upon the foundational data asset, as requested by NIH and the research community. This will enable EHR data to be linked to other types of data, such as pathology, genomics, and social determinants.
- Enable research projects on top of N3C through configuration of analytical templates, toolkits, packages, and workflows. The Contractor will continue to work closely with N3C researchers to collaboratively scope and configure additional research workflows within the N3C Platform. By integrating data from multiple sites, the platform will enable researchers to explore questions with vastly more statistical power than is achievable at individual clinical sites, which is essential for better understanding.
- Configure the Environment's machine learning and artificial intelligence (AI) framework to enable more advanced research and analysis of large-scale clinical data. The Environment must be configured to perform production-grade machine learning analyses, including the use of graphics processing units (GPUs), on clinical data. These capabilities enable more advanced research and maximize the usefulness of a centralized approach to analytics. The Environment must support full model lifecycle management, including building, tuning, retraining, evaluating, and monitoring.
- Train and onboard N3C researchers on collaborative analytics tooling within N3C. In a train-the-trainer approach, the Contractor, in collaboration with NCATS, will continue to train and onboard extramural researchers in both the US and other approved locations, such as the European Union (EU), to fulfill the vision of a truly shared, collaborative research infrastructure centered around the NIH and its extramural programs.
Software Requirements:
- A commercial software solution deployable on day one of the project that can be configured within expedited timelines.
- An open data architecture, where data always remains under the full control of NCATS and other data owners and can be easily exported in open, non-proprietary data formats via open APIs. The software should be built on an open, distributed microservices architecture with well-documented REST APIs and out-of-the-box connectors that are designed to seamlessly interface with other systems, adapt to meet evolving needs, and avoid system lock-in.
- Proven multi-modal data integration capabilities, including the ability to rapidly ingest electronic health/medical record (EHR/EMR) data (including OMOP, TriNetX, ACT, PCORnet, etc.), pathology samples and assay data, unprocessed high-throughput drug screening (HTS) outputs, genomic data (including bulk RNA-seq, scRNA-seq, CITEseq, TCRseq, ChIPseq, Microarray, etc.), imaging data (e.g., MRIs), mass spectrometry, flow cytometry, and other data types used in basic and translational biosciences research and public health, such as administrative, financial, and grants data (e.g., nVision, IMPAC II, I2E, myDCEG, ARS, NIDB, PubMed), and supply chain data. Backed by configurable and interoperable data quality checks and a Git repository for data pipelining.
- A multi-tenant secure enclave backed by configurable governance and access policies. Ability to host multiple individual tenants, with subsets of data shareable with different parties in the model as desired and in accordance with access controls. Access to a single view of multi-modal data based on user group and/or role.
- Proven granular security controls with the ability for data owners to control all downstream uses of the originating data easily and dynamically, and the ability to conform to NCATS security policies. Ability to request and grant selective access to levels of data sensitivity in-platform based on a user's intended purpose, implement configurable governance workflows depending on the requirements of data use and data transfer agreements, and audit user behavior after access has been provisioned.
- The ability to maintain data and scientific provenance and reproducibility of all integrated data sources. Every resource (dataset, analysis, code, plot, report) carries provenance and metadata, can be traced back to the exact version of all upstream dependencies, and has a dependency tree that can be easily replayed given new data or updated analysis logic, while still retaining prior versions and branches.
- Dynamic data model, object-based search/discoverability, and analysis workflows, allowing easy definition of objects, properties, and links that propagate from a source table. Solution provides natural ways to move between tabular and object-oriented interfaces and data analyses. Proven ability to integrate multi-data model data into a harmonized data model (such as OMOP).
- Intuitive, highly configurable user interfaces that have been effectively configured and utilized by technical bioinformaticians, cheminformaticians, data engineers, and data scientists, as well as less technical biologists, chemists, clinicians, analysts, program managers, administrators, and other users.
- Ability to perform advanced analytics and informatics (including management of machine learning and other models) in a user's preferred open coding language, as well as in point and click tools, all within the same environment. Ability to generate no-code analytical templates enabling less technical users to conduct complex analyses and generate visualizations.
- A variety of proven, secure, and user-oriented configurable applications and workflows backed by configurable access controls and up-to-date data, including:
10.1. Patient digital twin capability for tens of thousands of patients backed by multi-modal data (e.g., clinical, imaging, and tumor sequencing/mutation data).
10.2. Laboratory sample and result tracking system.
10.3. Streamlined research funding analysis, tracking, and reporting interface for improved funding estimates and budget oversight.
10.4. Genomic pipeline code templates for generating analysis and publication-ready visualizations.
10.5. Application for sharing and re-use of research outputs. A centralized space where logic, datasets, models, and other research outputs can be securely shared, discovered, and re-used by other researchers. Usage of each artifact should automatically be tracked to ensure attribution for contributing researchers. The application should ensure that use of shared artifacts is compliant with governance rules around data use.
10.6. Application for creation and management of code sets. This should allow automatic integration and updates for multiple terminologies and include the ability to version code sets, track their usage, and document them with metadata such as their provenance and intention. Changes to vocabularies should be tracked and users should be alerted when these changes impact existing code sets.
- Collaboration capabilities enabling teams comprising a range of technical and less technical roles to work seamlessly and concurrently on the same data, build on insights, merge similar analytical paths, and track progress in one place.
- Ability to scale flexibly with increasing users, data, and pipeline complexity, while providing fine-grained ways to adjust resource consumption including auto-scalable containerized compute. Proven ability to scale up to thousands of users (including thousands of potential outside collaborators globally), petabytes of raw and processed data, daily updates in the terabytes, and management of thousands of complex bioinformatic pipelines requiring processing components developed in a variety of languages and environments.
- Proven interoperability with NCATS' current IT investment landscape. Includes omnipresent APIs and plugin points that allow the system to keep up with the changing needs of NCATS and partners, and support both third-party software and other analytic applications. NCATS also requires the ability to independently implement new configurations, plugins, integrations, and extensions to meet new and unforeseen needs and interface with external systems. Ability for less technical users to develop workflows and applications in a low-code/no-code suite.
Anticipated period of performance: The anticipated period of performance will be a 3-year ordering period (Indefinite Delivery/Indefinite Quantity Contract).
Capability statement/information sought:
For the purposes of responding to this notice, vendors should demonstrate how their software meets the Professional Services requirements (6 items) and Software Requirements (13 items) identified above.
Vendors that believe they possess the capabilities to provide the required services should submit documentation of their ability to meet each of the project requirements to the Contracting Officer. The capability statement must specifically address each of the project requirements separately. Additionally, the capability statement should include 1) the total number of employees, 2) the professional qualifications of personnel as they relate to the requirements outlined, 3) any contractor GSA Schedule contracts and/or other government-wide acquisition contracts (GWACs) by which all of the requirements may be met, if applicable, and 4) any other information considered relevant to this program. Vendors must also provide their Company Name, Unique Entity ID from SAM.gov, Physical Address, and Point of Contact Information.
Interested small businesses are required to identify their type of business, applicable North American Industry Classification System (NAICS) Code, and size standards in accordance with the Small Business Administration. The government requests that no proprietary or confidential business data be submitted in a response to this notice. However, responses that indicate the information therein is proprietary will be properly safeguarded for Government use only. Capability statements must include the name and telephone number of a point of contact having authority and knowledge to discuss responses with Government representatives. Capability statements in response to this market survey that do not provide sufficient information for evaluation will be considered non-responsive. When submitting this information, please reference the notice number.
The respondent must also provide their Unique Entity ID from SAM.gov, organization name, address, point of contact, and size and type of business (e.g., 8(a), HUBZone, etc.) pursuant to the applicable NAICS code, and any other information that may be helpful in developing or finalizing the acquisition requirements.
The information submitted must be in an outline format that addresses each of the elements of the project requirements and of the capability statement/information sought paragraphs stated herein. A cover page and an executive summary may be included but are not required.
The response is limited to twenty (20) pages. The 20-page limit includes the cover page, executive summary, and references, if requested.
The response must include the respondents' technical and administrative points of contact, including names, titles, addresses, telephone numbers, and e-mail addresses.
All responses to this notice must be submitted by email to the Contract Specialist and Contracting Officer.
The response must be submitted to Brian O'Laughlin, Contracting Officer, at e-mail address olaughlinb@nida.nih.gov.
The response must be received on or before September 2, 2025, at 10:00 AM EDT.
Disclaimer and Important Notes: This notice does not obligate the Government to award a contract or otherwise pay for the information provided in response. The Government reserves the right to use information provided by respondents for any purpose deemed necessary and legally appropriate. Any organization responding to this notice should ensure that its response is complete and sufficiently detailed to allow the Government to determine the organization's qualifications to perform the work.
Respondents are advised that the Government is under no obligation to acknowledge receipt of the information received or provide feedback to respondents with respect to any information submitted. After a review of the responses received, a presolicitation synopsis and solicitation may be published on SAM.gov (formerly Federal Business Opportunities). However, responses to this notice will not be considered adequate responses to a solicitation.
Confidentiality: No proprietary, classified, confidential, or sensitive information should be included in your response. The Government reserves the right to use any non-proprietary technical information in any resultant solicitation(s).