Getting Started
We offer two versions of our platform:
The DRH Cloud and DRH Edge platforms serve distinct user needs, each providing a tailored solution for data management and analysis. This guide helps you decide when and why to choose each platform:
- Cloud Version:
  The cloud version will enable users to collaborate with fellow researchers, share de-identified datasets, and assess patient outcomes on a global platform. Features such as study creation, data upload, and metric visualization are part of ongoing development and will become available in the future. Currently, publicly archived studies can be accessed here.
- Local Version:
  DRH Edge is a locally customizable tool that is still in the development phase. It must be adapted to each end user's specific file and folder structure to function correctly.
  If you plan to use the Edge version, please provide your file and folder structure in advance. This allows us to adapt the SQL and create a tailored package.
Choosing Between DRH Edge and DRH Cloud
- Use DRH Edge for localized, secure data handling, particularly when working with sensitive datasets or needing offline data exploration.
- Opt for DRH Cloud when collaborating with teams or partners on non-sensitive data that requires scalability and centralized access.
Note: For any technical queries, please open a discussion describing the issue at https://github.com/diabetes-research/help-desk-non-phi/discussions
Data Submission Process
If you are a researcher who would like to submit your CGM data, please follow the steps below:
1. Download and review the MoU
Before you register with us and submit your data, please review our MoU. This document outlines the kind of data expected, secure data sharing methods, data ownership details, and the benefits you, as a researcher, will receive.
2. Identify the mode of data submission
There are three ways to provide your data:
- Upload Anonymized Files:
  You can upload anonymized files, organized according to the specified CGM file structure, to the provided SFTP folder.
- Request a Customized DRH Edge Package (customization required):
  If you require a customized DRH Edge package, please provide the details of your file structure, the column names to be anonymized or de-identified, and other relevant information. A tailored package will be developed for you on that basis. The package must be tested and verified before it can be used. The database generated with the DRH Edge package can then be uploaded to the SFTP folder.
- Cloud Version (Under Development):
  The cloud version, currently in development, will in future allow researchers to create studies, upload data, and view metrics. These studies can be shared within your organization or made publicly visible.
3. Request an SFTP account based on the data submission mode
If you have decided to submit files or a database through SFTP, kindly request an SFTP account:
- As a new user to DRH, you must send a request to researchhub-help@diabetestechnology.org, along with the name of your organization, to have an SFTP account set up. This account will enable you to securely transfer your files to DRH.
- Based on your request, a new SFTP user account will be created for you.
- Once the folder setup and permissions are ready, you will be notified via the email address from which you submitted your request.
- After receiving your credentials, you can log in at the SFTP site using your credentials.
- You will be able to view the folder created for you, where you can upload your CGM data.
Note: File uploads via SFTP will continue until the transition to a web-based UI occurs in the near future.
4. Organize your CGM data in the standard format provided
To collect and unify data received from disparate sources, we have defined a standard format in which you must organize your data before uploading it to SFTP. Your CGM data must be organized into the following files. The structure assumes that the data comes from a research study. Please note that you must use exactly the file names shown below.
a. cgm_tracing_0000n.csv
File Description: The “cgm_tracing_0000n” file typically contains a record of continuous glucose monitoring (CGM) data collected over a specific period of time for a patient from various CGM devices. This is the raw CGM data obtained directly from the CGM device.
Note: You can add a suffix of your choice after cgm_tracing to distinguish multiple tracing files. The suffix “_0000n” suggested above is only a recommendation.
Note: Use a comma as the delimiter in all CSV files.
Accepted CGM Data Formats:
The table below lists the accepted CGM data sources, with the manufacturer, sensors, data format (platform), and file type for each:
Manufacturer | Sensors | Data Format (Platform) | File Type |
---|---|---|---|
Abbott | Libre 2, Libre 3 | Freestyle Libre | CSV |
Dexcom | G6, G7, Stelo | Clarity | CSV |
Medtronic | | Carelink | CSV |
Senseonics | | | CSV |
Tidepool | any | Tidepool | CSV |
Glooko | any | Glooko | CSV |
For each patient, a cgm_tracing file containing CGM data can be provided. The metadata associated with each cgm_tracing file can be linked in the cgm_file_metadata file. The number of cgm_tracing files will increase based on the number of patients included in the study.
b. cgm_file_metadata.csv
File Description: Metadata associated with CGM data files. Given below are the columns that provide additional information about the data in the raw cgm_tracing.csv file. Also, a researcher can choose to add custom columns in addition to the columns given below.
Field | Description |
---|---|
metadata_id | A unique identifier for the record |
devicename | Name of the device |
device_id | Unique identifier for the device |
source_platform | Platform or system from which data originated |
patient_id | Unique identifier for the patient |
file_name | Name of the uploaded file |
file_format | Format of the uploaded file (e.g., CSV, Excel) |
file_upload_date | Date when the file was uploaded |
data_start_date | Start date of the data period covered by the file |
data_end_date | End date of the data period covered by the file |
map_field_of_cgm_date | Specifies the column in the file that maps to CGM date time |
map_field_of_cgm_value | Specifies the column in the file that maps to CGM values |
study_id | Unique identifier for the study associated with the data |
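As an illustration of how the two mapping fields might be used, the sketch below reads one metadata row and pulls the date-time and glucose columns out of the matching tracing file. The sample column names (`Date Time`, `CGM Value`), the sample values, and the helper `read_tracing` are hypothetical; this is not the DRH ingestion code.

```python
import csv
import io


def read_tracing(metadata_row, tracing_text):
    """Return (timestamp, value) pairs using the columns named in the metadata row."""
    date_col = metadata_row["map_field_of_cgm_date"]
    value_col = metadata_row["map_field_of_cgm_value"]
    reader = csv.DictReader(io.StringIO(tracing_text))
    # Fail early if the tracing file does not contain the mapped columns.
    missing = {date_col, value_col} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"tracing file lacks mapped columns: {sorted(missing)}")
    return [(row[date_col], float(row[value_col])) for row in reader]


# Hypothetical metadata row and tracing content, for illustration only.
metadata_row = {
    "file_name": "cgm_tracing_00001.csv",
    "map_field_of_cgm_date": "Date Time",
    "map_field_of_cgm_value": "CGM Value",
}
tracing_text = (
    "Date Time,CGM Value\n"
    "2024-01-01 00:00:00,110\n"
    "2024-01-01 00:05:00,114\n"
)

readings = read_tracing(metadata_row, tracing_text)
print(readings)
```

Because the column names come from the metadata row rather than being hard-coded, the same reader can handle exports whose headers differ from one device platform to another.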
c. participant.csv
File Description: Demographic information of study participants/patients.
Field | Description |
---|---|
participant_id | Unique identifier for the participant/patient |
study_id | Unique identifier for the study |
site_id | Identifier for the site where participant is enrolled |
diagnosis_icd | Diagnosis code based on International Classification of Diseases (ICD) system |
med_rxnorm | Medication code based on RxNorm system |
treatment_modality | Modality of treatment for the participant |
gender | Gender of the participant |
race_ethnicity | Race and ethnicity of the participant |
age | Age of the participant |
bmi | Body Mass Index (BMI) of the participant |
baseline_hba1c | Baseline Hemoglobin A1c level of the participant |
diabetes_type | Type of diabetes diagnosed for the participant |
study_arm | Arm or group to which the participant is assigned in the study |
d. site.csv
File Description: “Site” typically refers to the physical location or locations where the study is conducted or where participants are recruited. This file contains site information for the CGM study, including details about the specific facilities, clinics, or hospitals involved in the research and any pertinent characteristics of these locations.
Field | Description |
---|---|
study_id | Unique identifier for the study |
site_id | Unique identifier for the site |
site_name | Name of the site |
site_type | Type or category of the site (e.g., hospital, clinic) |
e. study.csv
File Description: The study file typically contains information about a specific research study.
Field | Description |
---|---|
study_id | Unique identifier for the study |
study_name | Name or title of the study |
start_date | Date when the study commences |
end_date | Date when the study concludes |
treatment_modalities | Different modalities or interventions used in the study |
funding_source | Source(s) of funding for the study |
nct_number | ClinicalTrials.gov identifier for the study |
study_description | Description of the study |
f. investigator.csv
File Description: Details of investigators/researchers involved in the study.
Field | Description |
---|---|
investigator_id | The ID of the investigator / researcher |
investigator_name | Name of the researcher |
email | Researcher email |
institution_id | Unique identifier for the institution |
study_id | ID for the study associated with the researcher |
g. institution.csv
File Description: This file contains information about institutions involved in a study.
Field | Description |
---|---|
institution_id | Unique identifier for the institution |
institution_name | Name of the institution |
city | City where the institution is located |
state | State where the institution is located |
country | Country where the institution is located |
h. lab.csv
File Description: This file contains information about laboratories involved in a study.
Field | Description |
---|---|
lab_id | Unique identifier for the laboratory |
lab_name | Name of the laboratory |
lab_pi | Principal investigator associated with the lab |
institution_id | Unique identifier of the institution the lab belongs to |
study_id | Unique identifier for the study |
i. author.csv
File Description: This file contains information about authors involved in a study publication.
Field | Description |
---|---|
author_id | Unique identifier for the author |
name | Name of the author |
email | Email of the author |
investigator_id | Unique identifier of the investigator the author is associated with |
study_id | Unique identifier for the study |
j. publication.csv
File Description: This file contains information about publications resulting from a study.
Field | Description |
---|---|
publication_id | Unique identifier for the publication |
publication_title | Title of the publication |
digital_object_identifier | Identifier for the digital object associated with the publication |
publication_site | Publishing site |
study_id | Unique identifier for the study |
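Before uploading, it can help to confirm that each file carries the columns listed in the tables above and that the shared study_id values line up across files. The following is a minimal sketch under stated assumptions: it treats every listed column as required (the guide does not say which are optional), covers only two of the files for brevity, and uses hypothetical helper names and sample rows.

```python
import csv
import io

# Column names copied from the field tables above (two files shown for brevity).
EXPECTED_COLUMNS = {
    "study.csv": [
        "study_id", "study_name", "start_date", "end_date",
        "treatment_modalities", "funding_source", "nct_number",
        "study_description",
    ],
    "site.csv": ["study_id", "site_id", "site_name", "site_type"],
}


def missing_columns(file_name, csv_text):
    """List expected columns that are absent from the file's header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip() for name in header}
    return [col for col in EXPECTED_COLUMNS[file_name] if col not in present]


def unknown_study_ids(child_text, study_text):
    """List study_id values used in a child file but not defined in study.csv."""
    known = {row["study_id"] for row in csv.DictReader(io.StringIO(study_text))}
    used = {row["study_id"] for row in csv.DictReader(io.StringIO(child_text))}
    return sorted(used - known)


# Hypothetical sample content, for illustration only.
study_text = (
    "study_id,study_name,start_date,end_date,treatment_modalities,"
    "funding_source,nct_number,study_description\n"
    "S1,Demo study,2024-01-01,2024-06-30,CGM,Example fund,NCT00000000,Example\n"
)
site_text = "study_id,site_id,site_name\nS1,SITE1,Clinic A\nS2,SITE2,Clinic B\n"

site_missing = missing_columns("site.csv", site_text)
site_orphans = unknown_study_ids(site_text, study_text)
print(site_missing, site_orphans)
```

In this sample, the site file lacks the site_type column and refers to a study that study.csv does not define, so both checks report a problem.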
5. Submit the CGM data
After selecting the appropriate data submission mode, please follow these steps to submit your data:
- Prepare Your Data/Database:
  Rename the database or study files folder to reflect your study name for easy identification.
  If uploading an Edge-generated database, please refer to the DRH Edge custom package setup and guidelines.
- Upload to SFTP:
  Upload the renamed database or study folder to the SFTP folder assigned to you.
- Access Your SFTP Account:
  Log in to your SFTP account using the credentials provided by DRH via email.
Important Note: Your data is secure and accessible only to you and DRH.
DRH Cloud
The DRH Cloud platform is designed for collaboration and scalability, making it ideal for users who:
- Require seamless collaboration across teams, institutions, or external partners.
- Are working with non-sensitive data that can be securely stored and processed on external servers.
- Need scalable computing resources to handle large datasets efficiently, benefiting from robust cloud infrastructure.
DRH Edge
The DRH Edge platform is designed for localized data handling and is ideal for the following scenarios:
1. Data De-identification
- Use when sensitive information, such as participant demographics or CGM data, requires anonymization before external sharing or publication.
- Suitable for datasets containing personal details, including patient information, requiring compliance with privacy regulations and organizational data-sharing policies.
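To make the idea of column-level de-identification concrete, here is a minimal sketch that replaces identifier values with keyed-hash tokens. It is an assumption-laden illustration, not how DRH Edge works: the tailored Edge package performs de-identification through its own SQL scripts, and the function names, the secret key, and the sample rows below are all hypothetical.

```python
import csv
import hashlib
import hmac
import io


def pseudonymize(value, secret_key):
    """Map an identifier to a stable token using HMAC-SHA256 (keyed hash)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened token; the same input always gives the same token


def deidentify_csv(csv_text, columns, secret_key):
    """Return CSV text with the named columns replaced by pseudonymous tokens."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for column in columns:
            row[column] = pseudonymize(row[column], secret_key)
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample rows and key, for illustration only.
sample = "participant_id,age\nP001,34\nP001,34\nP002,51\n"
result = deidentify_csv(sample, ["participant_id"], secret_key=b"keep-this-local")
print(result)
```

A keyed hash keeps the mapping consistent within a study, so records for the same participant still link together, while the original identifier cannot be recomputed without the key, which stays on the local machine.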
2. Local Data Preview and Analysis
- Allows users to preview study data, analyze metrics, and view charts directly on their local machines.
- Ensures secure data exploration in a controlled, offline environment without the need for external data sharing.
Using DRH Edge Software
If you prefer to use the DRH Edge software for de-identification and anonymization, please follow these steps:
- Meet the Edge Usage Prerequisites:
  - Provide the DRH Technical (Developer) team with detailed information about the file structure and column patterns in advance.
  - This information is essential for developing the SQL scripts and creating a tailored package suitable for your specific needs.
  - Ensure the required files (marked in green) are included in the file preparation.
- Custom Package Preparation:
  - The DRH technical team will develop a bespoke package tailored to your organization’s file patterns.
  - Share detailed requirements for de-identification columns with the technical team.
- Verification and Upload:
  - After testing the tailored package and generating the database, verify the database.
  - Once verified, upload the database to the designated SFTP folder.
  Note: After processing through DRH Edge, the generated database can be verified by the Edge user and then uploaded to the SFTP folder for integration into the cloud version.
DRH Edge UI Setup and Data Transformation
In this guide, we will explain how to set up the DRH Edge tool and perform data transformation using a tailored sample package for UVA.
Current DRH Edge Sample Data Package
- The sample DRH Edge package is modeled on the UVA file structure, as previously provided to the DRH Netspective team through the SFTP folder.
- Users requiring a customized version should share their file structure and details about anonymization needs (e.g., specific columns to de-identify).
The DRH Edge tool allows you to securely convert your CSV files, perform de-identification, and conduct verification and validation (V&V) processes all within your own environment. You can view the results directly on your local system.
Requirements for Previewing the Edge UI:
- Surveilr Tool (use the latest version unless specified).
  Note: The compatibility of the surveilr tool with the operating system (OS) should be periodically tested to ensure it continues to function as expected.
- Deno Runtime (requires Deno 2.0)
  Follow the Deno Installation Guide for step-by-step instructions.
  If Deno is already installed, upgrade to Deno 2.0 by running the following command as an administrator:
  deno upgrade
Getting Started
Step 1: Navigate to the Folder Containing the Files
- Open the command prompt and navigate to the directory with your files.
- Command:
cd <folderpath>
- Example:
cd D:/DRH-Files
Step 2: Download Surveilr
- Follow the installation instructions at the Surveilr Installation Guide.
- Download the latest version of surveilr from Surveilr Releases and place it in the folder.
Step 3: Verify the Tool Version
- Run the command surveilr --version in the command prompt, or .\surveilr --version in PowerShell.
- If the tool is installed correctly, it will display the version number.
The folder structure should look like this:
surveilr.exe
study-files/
├── cgm_file_metadata.csv
├── participant.csv
├── cgm_tracing_001
├── cgm_tracing_002
├── cgm_tracing_003
├── cgm_tracing_004
├── cgm_tracing_005
├── cgm_tracing_006
└── ...
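A quick pre-flight check of the study folder can catch a missing file before the conversion is run. The sketch below is only an assumption-based helper, not part of surveilr or the DRH package: it treats the two CSV files shown in the listing above as required and accepts any file whose name starts with cgm_tracing. The function name and the temporary demo folder are hypothetical.

```python
import tempfile
from pathlib import Path

# Assumed to be required because they appear in the folder listing above.
REQUIRED_FILES = ["cgm_file_metadata.csv", "participant.csv"]


def check_study_folder(folder):
    """Return a list of human-readable problems found in the study folder."""
    names = {p.name for p in Path(folder).iterdir() if p.is_file()}
    problems = [f"missing {name}" for name in REQUIRED_FILES if name not in names]
    if not any(name.startswith("cgm_tracing") for name in names):
        problems.append("no cgm_tracing files found")
    return problems


# Demo on a throwaway folder that deliberately lacks cgm_file_metadata.csv.
with tempfile.TemporaryDirectory() as tmp:
    study = Path(tmp) / "study-files"
    study.mkdir()
    (study / "participant.csv").write_text("participant_id\n")
    (study / "cgm_tracing_001").write_text("")
    problems = check_study_folder(study)

print(problems)
```

An empty list means the folder matches the expected layout; pointing the function at your real study-files folder works the same way.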
Step 4: Execute the Commands Below
- Clear the cache by running the following command:
  deno cache --reload https://raw.githubusercontent.com/surveilr/www.surveilr.com/main/lib/service/diabetes-research-hub/drhctl.ts
- After clearing the cache, run the following single execution command:
  deno run -A https://raw.githubusercontent.com/surveilr/www.surveilr.com/main/lib/service/diabetes-research-hub/drhctl.ts 'foldername'
- Replace foldername with the name of your folder containing all CSV files to be converted.
  Example:
  deno run -A https://raw.githubusercontent.com/surveilr/www.surveilr.com/main/lib/service/diabetes-research-hub/drhctl.ts study-files
- After the above command completes execution, launch your browser and go to http://localhost:9000/drh/index.sql.
  This method provides a streamlined approach to complete the process and view the results quickly.
Step 5: Verify the Verification and Validation Results in the UI
- Check the following section in the UI and follow the steps as shown in the second image.
Tools for CSV Files Validation
We recommend the following third-party open-source tools to help validate whether your files adhere to the structure described above:
Data Curator
Data Curator is a lightweight desktop data editor designed to help describe, validate, and share open data. It offers a user-friendly graphical interface for editing tabular data while ensuring adherence to data standards.
Key Features:
- Schema Editing: Easily modify Frictionless JSON schemas to suit your project’s requirements.
- Load and Preview Data: Visualize and inspect CSV files in a user-friendly interface.
- Validate Against Schema: Perform schema validation using the Frictionless JSON specification.
- Edit and Save: Correct errors directly in the application and save validated data.
- Metadata Management: Add metadata to improve data usability.
- Export Options: Share validated datasets as clean CSV files or complete data packages.
Open Data Editor (ODE)
The Open Data Editor (ODE) is an online tool for non-technical data users to explore, validate, and detect errors in tabular datasets. It provides an intuitive web interface for identifying and correcting issues in open data.
Key Features:
- Online Access: No installation required—access it directly through your browser.
- Error Detection: Quickly identify errors in table formatting and data values.
- Schema Validation: Validate datasets against predefined schemas, including Frictionless JSON schema.
- Interactive Editing: Fix errors and inconsistencies in real-time.
- Export Options: Save corrected files for further use or sharing.
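Both tools validate against a Frictionless Table Schema, which is a JSON descriptor listing each field's name and type. As a rough sketch, such a descriptor for site.csv could be generated as shown below. The field names come from the site.csv table in this guide; typing every field as a string and marking study_id and site_id as required are assumptions, and the helper name is hypothetical.

```python
import json

# Field names taken from the site.csv table in this guide.
SITE_FIELDS = ["study_id", "site_id", "site_name", "site_type"]


def build_table_schema(field_names, required=()):
    """Build a minimal Table Schema descriptor with one entry per field."""
    fields = []
    for name in field_names:
        field = {"name": name, "type": "string"}  # assumed type
        if name in required:
            field["constraints"] = {"required": True}
        fields.append(field)
    return {"fields": fields}


schema = build_table_schema(SITE_FIELDS, required=("study_id", "site_id"))
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

The printed JSON can be saved to a file and loaded in either tool as a starting point, then refined there with more specific types and constraints.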