Description

The purpose of the Making Electronic Data More Available for Research and Public Health (MedMorph) Research Data Exchange Implementation Guide (IG) is to streamline and expedite the process of onboarding research data partners and contributing data to research networks. The current processes used to extract, transform, and load (ETL) the data and then contribute it involve many non-standardized mechanisms (e.g., Secure File Transfer Protocol (SFTP), Excel files, stored procedures), different structures (e.g., formats), and different semantics. As a result, the time needed to onboard a data partner varies from weeks to months. This research data exchange use case, together with the MedMorph Reference Architecture (RA) IG and other existing Health Level Seven (HL7®) Fast Healthcare Interoperability Resources (FHIR®) IGs, will help reduce the time it takes to onboard new data exchange partners.

Problem Statement

Goals of the Use Case

The goals of the Research Data Exchange use case include:

Improve efficiency in onboarding new data exchange partners
Improve data latency
Increase the volume, quality, completeness, and timeliness of data submitted to research organizations at a lower cost
Provide a standardized way to collect and exchange data for research
Leverage the MedMorph RA IG
Streamline data element mapping from the data mart

Scope of the Use Case

In-Scope

Bulk data exchange
Identify the data elements to be retrieved from the Data Source for research needs

Out-of-Scope

State and local policies around data exchange for research
Assessment of the data quality of the content extracted from the Data Source
Data captured outside the Data Source and communicated directly to data marts
Data exchange/data use agreements between the Data Source and the research organization
Utilization of unstructured data (e.g., images or text blobs)

Use Case Actors

Data Source: A system (e.g., EHR, HIE) that is used in care delivery, captures and stores data about patients, and makes the information available instantly and securely to authorized users. While an EHR contains the medical and treatment histories of patients, an EHR system is built to go beyond the standard clinical data collected at a provider’s care location and can include a broader view of a patient’s care. EHRs are a vital part of health IT and can:

Contain a patient’s medical history, diagnoses, medications, treatment plans, immunization dates, allergies, radiology images, and laboratory and test results
Allow access to evidence-based tools that providers can use to make decisions about a patient’s care
Automate and streamline provider workflow

A FHIR Enabled Data Source exposes FHIR APIs that other systems use to interact with the EHR and exchange data. FHIR APIs provide well-defined mechanisms to read and write data. The FHIR APIs are protected by an Authorization Server, which authenticates and authorizes users or systems before they access the data.
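As an illustration of how a Backend Services App might call a FHIR Enabled Data Source for bulk data exchange, the following Python sketch builds (but does not send) a system-level kick-off request in the style of the FHIR Bulk Data Access specification. The base URL, the access token, and the helper name build_bulk_export_request are hypothetical. The token is assumed to have been issued by the Authorization Server beforehand, and this IG may constrain the request differently.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class HttpRequest:
    """Plain description of an HTTP request; nothing is sent over the network."""
    method: str
    url: str
    headers: dict = field(default_factory=dict)


def build_bulk_export_request(base_url, access_token, resource_types, since=None):
    """Describe a system-level bulk export kick-off request (hypothetical helper)."""
    # _type limits the export to the listed FHIR resource types.
    params = {"_type": ",".join(resource_types)}
    if since is not None:
        # _since limits the export to resources updated after this instant.
        params["_since"] = since
    url = base_url.rstrip("/") + "/$export?" + urlencode(params)
    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",  # bulk export runs asynchronously
        "Authorization": "Bearer " + access_token,  # token from the Authorization Server
    }
    return HttpRequest("GET", url, headers)


# Example with placeholder values (assumptions, not real endpoints or tokens).
request = build_bulk_export_request(
    "https://ehr.example.org/fhir/", "token123", ["Patient", "Condition"]
)
print(request.method, request.url)
```

Keeping request construction separate from sending makes the authorization header and export parameters easy to inspect before any data leaves the Data Source.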

Backend Services App: A system that resides within the clinical care setting and interacts with the EHRs, Data Marts and the systems used by the researchers. The system will use specific Knowledge Artifact resources to identify when data has to be extracted, when data has to be queried and how results have to be returned.

Trust Service Provider: A system that provides capabilities to translate data between different models, map terminologies between data models, and pseudonymize, anonymize, de-identify, hash, or re-link data that is submitted to public health and/or research organizations. These capabilities are called Trust Services. Trust Services are used, when appropriate, by the Backend Services App.
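One way a pseudonymization Trust Service could work is with a keyed hash. The Python sketch below uses HMAC-SHA256 from the standard library. The algorithm choice, the function name pseudonymize, and the sample key are assumptions made for illustration; this guide does not prescribe a specific technique.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a stable, keyed pseudonym for a patient identifier (hypothetical helper)."""
    # The same key and identifier always produce the same pseudonym, so records
    # submitted at different times can be re-linked without exposing the identifier.
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key for illustration only; a real key would be managed securely.
key = b"example-secret-key"
first = pseudonymize("patient-123", key)
second = pseudonymize("patient-123", key)
print(first == second)
```

Because the hash is keyed, a party without the secret cannot recompute pseudonyms from known identifiers, which a plain unkeyed hash would allow.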

Data Mart: A system that holds data to be accessed by researchers. Typically, researchers do not access the EHR or operational data stores directly because those systems are used for clinical operations. To facilitate access for researchers, healthcare organizations populate Data Marts that researchers can access. The data models (e.g., PCORnet CDM, i2b2, OMOP, Sentinel, FHIR) used to store the data may depend on the types of research studies.

Research Abstract Model

Figure 1: Research Use Case Abstract Model

Use Case User Stories and Diagrams

Preconditions

IRB Approval

User Stories

User Story 1 – Onboard a New Research Data Partner

A research network administrator needs to onboard research data partners so that they can join the research network and contribute data that can be used for research. The data partners extract, transform, and load (ETL) the data from data sources such as Electronic Health Records (EHRs), Health Information Exchanges (HIEs), or other data repositories (e.g., clinical data warehouses). These processes (e.g., ETL), when standardized, will expedite onboarding and deliver better-quality data at a lower cost.

Table 1: Onboard a New Data Partner Workflow

Step	Actor	Role	Activity	Input(s)	Output(s)
1	Backend Services App	Requestor	Request artifacts from the Knowledge Artifact Repository (KAR)	Artifact identifiers	KAR request
2	KAR	Query Responder	Return the requested artifacts	KAR request	Response with artifacts
3	Backend Services App	Data Receiver	Receive and validate artifacts	Artifacts	Validated artifacts
4	Backend Services App	Provisioner	Apply parameters, value sets, etc. from artifacts	Validated artifacts	Updated Backend Services App (BSA) logic
5	Backend Services App	Provisioner	Send value sets with trigger codes to the EHR System	Validated artifacts	FHIR bundle
6	EHR System	Data Receiver	Receive and validate FHIR bundle	FHIR bundle	FHIR validated bundle
7	EHR System	Provisioner	Instantiate triggers based on BSA value sets	FHIR validated bundle	Triggers
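The steps in Table 1 can be sketched as a small in-memory simulation. The Python classes below are hypothetical stand-ins for the three actors. The artifact layout (a "valueSets" key), the validation checks, and the sample codes are assumptions made for illustration and are not defined by this guide.

```python
class KnowledgeArtifactRepository:
    """Stand-in for the KAR (steps 1-2)."""

    def __init__(self, artifacts):
        self._artifacts = artifacts  # artifact identifier -> artifact dict

    def fetch(self, artifact_ids):
        # Step 2: return the requested artifacts.
        return [self._artifacts[artifact_id] for artifact_id in artifact_ids]


class EhrSystem:
    """Stand-in for the EHR System (steps 6-7)."""

    def __init__(self):
        self.triggers = set()

    def receive_bundle(self, bundle):
        # Step 6: minimal structural validation of the incoming bundle.
        if bundle.get("resourceType") != "Bundle":
            raise ValueError("expected a FHIR Bundle")
        # Step 7: instantiate one trigger per (system, code) pair in each value set.
        for entry in bundle["entry"]:
            for include in entry["resource"]["compose"]["include"]:
                for concept in include.get("concept", []):
                    self.triggers.add((include["system"], concept["code"]))


class BackendServicesApp:
    """Stand-in for the Backend Services App (steps 1 and 3-5)."""

    def __init__(self, kar, ehr):
        self.kar = kar
        self.ehr = ehr
        self.value_sets = []

    def onboard(self, artifact_ids):
        artifacts = self.kar.fetch(artifact_ids)  # steps 1-2
        for artifact in artifacts:  # step 3: validate artifacts
            if "valueSets" not in artifact:
                raise ValueError("artifact has no value sets")
        # Step 4: apply the value sets from the artifacts to local logic.
        self.value_sets = [vs for a in artifacts for vs in a["valueSets"]]
        # Step 5: send the value sets with trigger codes to the EHR as a bundle.
        bundle = {
            "resourceType": "Bundle",
            "type": "collection",
            "entry": [{"resource": vs} for vs in self.value_sets],
        }
        self.ehr.receive_bundle(bundle)  # steps 6-7
        return bundle


# Example run with placeholder identifiers and codes.
kar = KnowledgeArtifactRepository({
    "research-artifact-1": {
        "valueSets": [{
            "resourceType": "ValueSet",
            "compose": {"include": [{
                "system": "http://example.org/codes",
                "concept": [{"code": "A1"}, {"code": "B2"}],
            }]},
        }]
    }
})
ehr = EhrSystem()
bsa = BackendServicesApp(kar, ehr)
bundle = bsa.onboard(["research-artifact-1"])
print(sorted(ehr.triggers))
```

In a real deployment each call between actors would be a FHIR API interaction rather than a direct method call; the sketch only shows the order of responsibilities.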

User Story 1 – Onboard a New Research Data Partner Activity Diagram

Figure 2 below illustrates the flow of events and information between the actors for the Onboard a New Research Data Partner workflow.

Figure 2: Onboard a New Research Data Partner User Story Activity Diagram

User Story 1 – Onboard a New Research Data Partner Sequence Diagram

Figure 3 below represents the interactions between actors, in the order in which they occur, for the Onboard a New Research Data Partner workflow.

Figure 3: Onboard a New Research Data Partner User Story Sequence Diagram

User Story 2 – Accessing Additional Data for a Specific Research Question (future consideration)

Once a data partner joins a research network and contributes data, a researcher should be able to run queries to filter specific data sets and analyze the data. Currently, researchers can query data marts only on a limited basis because of variations in the data models (e.g., National Patient-Centered Clinical Research Network (PCORnet) Common Data Model (CDM), Informatics for Integrating Biology & the Bedside (i2b2), Observational Medical Outcomes Partnership (OMOP), FHIR Resources) and in the technologies (e.g., Microsoft SQL Server, PostgreSQL, Statistical Analysis System (SAS)) used to host the data mart. Standardizing these data access mechanisms will help overcome the hurdles researchers face in accessing data from data partners and EHRs. NOTE: User Story #2 is out of scope for this version of the IG; it is included only as a potential use case that could be expanded on in the future.

Research Use Case