Welcome to the Diabetes Research Hub

Getting Started

We offer two versions of our platform:

The DRH Cloud and DRH Edge platforms cater to distinct user needs, providing tailored solutions for effective data management and analysis. This guide helps users determine when and why to choose each platform:

  • Cloud Version:
    The cloud version is designed to let users collaborate with fellow researchers, share de-identified datasets, and assess patient outcomes on a global platform. Features such as study creation, data upload, and metric visualization are planned as part of ongoing development. Currently, publicly archived studies can be accessed here.

  • Local Version:
    DRH Edge is a locally customizable tool that is still in the development phase. It requires adaptation based on each end user’s specific file and folder structure to ensure optimal functionality.
    If a user plans to use the Edge version, we kindly request that they provide their file and folder structure in advance. This will allow us to adapt the SQL and create a tailored package.

Choosing Between DRH Edge and DRH Cloud

  • Use DRH Edge for localized, secure data handling, particularly when working with sensitive datasets or needing offline data exploration.
  • Opt for DRH Cloud when collaborating with teams or partners on non-sensitive data that requires scalability and centralized access.

Note: For any technical queries, please create a discussion describing the issue at https://github.com/diabetes-research/help-desk-non-phi/discussions


Data Submission Process

If you are a researcher who would like to submit your CGM data, please follow the steps below:

1. Download and review the MoU

Before you register with us and submit your data, please review our MoU. This document outlines the kind of data expected, secure data sharing methods, data ownership details, and the benefits you, as a researcher, will receive.

2. Identify the mode of data submission

There are three ways to provide your data:

  1. Upload Anonymized Files:
    You can upload anonymized files, organized according to the CGM file structure specified, to the provided SFTP folder.

  2. Request a Customized DRH Edge Package (customization required):
    If you require a customized DRH Edge package, please provide the details of your file structure, the column names to be anonymized or de-identified, and other relevant information. Based on this, a tailored package will be developed for you. The package must be tested and verified before it can be used. The database generated using the DRH Edge package can then be uploaded to the SFTP folder.

  3. Cloud Version (Under Development):
    The cloud version, currently in development, will allow researchers to create studies, upload data, and view metrics. These studies can be shared within your organization or made publicly visible.

3. Request an SFTP account based on data submission mode

If you have decided to submit files or a database through SFTP, kindly request an SFTP account:

  • As a new user to DRH, you must send a request to researchhub-help@diabetestechnology.org, along with the name of your organization, to have an SFTP account set up. This account will enable you to securely transfer your files to DRH.
  • Based on your request, a new SFTP user account will be created for you.
  • Once the folder setup and permissions are ready, you will be notified via the email address from which you submitted your request.
  • After receiving your credentials, you can use them to log in to the SFTP site.
  • You will be able to view the folder created for you, where you can upload your CGM data.

Note: File uploads via SFTP will continue until the transition to a web-based UI occurs in the near future.

4. Organize your CGM data in the standard format provided

To collect and unify data received from disparate sources, we have defined a standard format in which you must organize your data before uploading it to SFTP. Your CGM data must be organized into the following files. The structure is designed on the assumption that the data comes from a research study. Please note that you must use exactly the file names shown below.

View the expected file structure

a. cgm_tracing_0000n.csv

File Description: The “cgm_tracing_0000n” file typically contains a record of continuous glucose monitoring (CGM) data collected over a specific period of time for a patient from various CGM devices. This is the raw CGM data obtained directly from the CGM device.

Note: You can add a suffix of your choice after cgm_tracing to distinguish multiple tracing files. The suffix “_0000n” suggested above is only a recommendation.

Note: Use comma delimiter in all csv files.

Accepted CGM Data Formats:

The table below lists the supported CGM data sources, with the manufacturer, sensors, data format (platform), and file type for each:

Manufacturer  Sensors           Data Format (Platform)  File Type
Abbott        Libre 2, Libre 3  Freestyle Libre         CSV
Dexcom        G6, G7, Stelo     Clarity                 CSV
Medtronic                       Carelink                CSV
Senseonics                                              CSV
Tidepool      any               Tidepool                CSV
Glooko        any               Glooko                  CSV

For each patient, a cgm_tracing file containing CGM data can be provided. The metadata associated with each cgm_tracing file can be linked in the cgm_file_metadata file. The number of cgm_tracing files will increase based on the number of patients included in the study.
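
Because all CSV files are expected to be comma-delimited, a quick check of the header line can catch exports that use semicolons or tabs before they are uploaded. The following is a minimal sketch of such a check, not part of DRH itself. The helper names and the sample column names (timestamp, glucose) are hypothetical, and the heuristic simply counts candidate delimiters in the first line, so it can be fooled by quoted headers that contain other separators.

```python
from typing import Optional

# Delimiters commonly seen in device exports (an assumption, not a DRH list).
CANDIDATE_DELIMITERS = [",", ";", "\t", "|"]


def detect_delimiter(header_line: str) -> Optional[str]:
    """Return the candidate delimiter that occurs most often, or None if none occur."""
    counts = {d: header_line.count(d) for d in CANDIDATE_DELIMITERS}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def uses_comma_delimiter(csv_text: str) -> bool:
    """True when the first line of the CSV text appears to be comma-delimited."""
    lines = csv_text.splitlines()
    return bool(lines) and detect_delimiter(lines[0]) == ","


if __name__ == "__main__":
    # Hypothetical header and row, for illustration only.
    sample = "timestamp,glucose\n2024-01-01 00:00,110\n"
    print(uses_comma_delimiter(sample))
```

To apply it to a real file, read the file's text (for example with pathlib.Path.read_text) and pass it to uses_comma_delimiter.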

b. cgm_file_metadata.csv

File Description: Metadata associated with CGM data files. Given below are the columns that provide additional information about the data in the raw cgm_tracing.csv file. Also, a researcher can choose to add custom columns in addition to the columns given below.

Field                   Description
metadata_id             A unique identifier for the record
devicename              Name of the device
device_id               Unique identifier for the device
source_platform         Platform or system from which data originated
patient_id              Unique identifier for the patient
file_name               Name of the uploaded file
file_format             Format of the uploaded file (e.g., CSV, Excel)
file_upload_date        Date when the file was uploaded
data_start_date         Start date of the data period covered by the file
data_end_date           End date of the data period covered by the file
map_field_of_cgm_date   Specifies the column in the file that maps to CGM date time
map_field_of_cgm_value  Specifies the column in the file that maps to CGM values
study_id                Unique identifier for the study associated with the data
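
The mapping fields and the data period can be derived from a tracing file rather than typed by hand. The sketch below illustrates one way to do that with the Python standard library. It is an illustration only: the function names, the sample column names (Date_Time, CGM_Value), and the timestamp format are assumptions, and real device exports may need a different format string.

```python
import csv
import io
from datetime import datetime


def data_period(csv_text, date_field, date_format="%Y-%m-%d %H:%M:%S"):
    """Return (data_start_date, data_end_date) as ISO dates for one tracing file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    stamps = [
        datetime.strptime(row[date_field], date_format)
        for row in reader
        if row.get(date_field)
    ]
    if not stamps:
        raise ValueError("no values found in column " + date_field)
    return min(stamps).date().isoformat(), max(stamps).date().isoformat()


def build_metadata_row(file_name, csv_text, date_field, value_field, **other_fields):
    """Assemble one cgm_file_metadata row; remaining columns come from other_fields."""
    start, end = data_period(csv_text, date_field)
    row = {
        "file_name": file_name,
        "file_format": "CSV",
        "data_start_date": start,
        "data_end_date": end,
        "map_field_of_cgm_date": date_field,
        "map_field_of_cgm_value": value_field,
    }
    row.update(other_fields)
    return row


if __name__ == "__main__":
    # Hypothetical tracing content, for illustration only.
    sample = (
        "Date_Time,CGM_Value\n"
        "2023-03-02 08:05:00,112\n"
        "2023-03-01 23:55:00,140\n"
        "2023-03-04 00:10:00,98\n"
    )
    print(build_metadata_row("cgm_tracing_00001.csv", sample, "Date_Time", "CGM_Value"))
```

Fields such as metadata_id, devicename, patient_id, and study_id are not derivable from the tracing file and would be passed in as keyword arguments.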

c. participant.csv

File Description: Demographic information of study participants/patients.

Field               Description
participant_id      Unique identifier for the participant/patient
study_id            Unique identifier for the study
site_id             Identifier for the site where participant is enrolled
diagnosis_icd       Diagnosis code based on International Classification of Diseases (ICD) system
med_rxnorm          Medication code based on RxNorm system
treatment_modality  Modality of treatment for the participant
gender              Gender of the participant
race_ethnicity      Race and ethnicity of the participant
age                 Age of the participant
bmi                 Body Mass Index (BMI) of the participant
baseline_hba1c      Baseline Hemoglobin A1c level of the participant
diabetes_type       Type of diabetes diagnosed for the participant
study_arm           Arm or group to which the participant is assigned in the study

d. site.csv

File Description: A “site” is a physical location where the study is conducted or where participants are recruited. This file contains site-related information for the CGM study, including the specific facilities, clinics, or hospitals involved in the research and any pertinent characteristics or attributes of these locations.

Field      Description
study_id   Unique identifier for the study
site_id    Unique identifier for the site
site_name  Name of the site
site_type  Type or category of the site (e.g., hospital, clinic)

e. study.csv

File Description: The study file typically contains information about a specific research study.

Field                 Description
study_id              Unique identifier for the study
study_name            Name or title of the study
start_date            Date when the study commences
end_date              Date when the study concludes
treatment_modalities  Different modalities or interventions used in the study
funding_source        Source(s) of funding for the study
nct_number            ClinicalTrials.gov identifier for the study
study_description     Description of the study
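
The participant, site, and study files share the study_id and site_id identifiers. This guide does not spell out a consistency rule, but it is reasonable to assume that an identifier used in participant.csv should also appear in site.csv or study.csv. The sketch below, with hypothetical helper names and sample values, reports identifiers that have no match.

```python
import csv
import io


def column_values(csv_text, column):
    """Collect the non-empty values of one column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader if row.get(column)}


def unknown_references(child_csv, parent_csv, key):
    """Return values of `key` used in the child file but absent from the parent file."""
    return sorted(column_values(child_csv, key) - column_values(parent_csv, key))


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    study = "study_id,study_name\nS1,Example study\n"
    site = "study_id,site_id,site_name,site_type\nS1,SITE1,Example clinic,clinic\n"
    participant = "participant_id,study_id,site_id\nP1,S1,SITE1\nP2,S1,SITE9\n"
    print(unknown_references(participant, study, "study_id"))
    print(unknown_references(participant, site, "site_id"))
```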


f. investigator.csv

File Description: Details of investigators/researchers involved in the study.

Field              Description
investigator_id    The ID of the investigator / researcher
investigator_name  Name of the Researcher
email              Researcher email
institution_id     Unique identifier for the institution
study_id           ID for the study associated with the researcher

g. institution.csv

File Description: This file contains information about institutions involved in a study.

Field             Description
institution_id    Unique identifier for the institution
institution_name  Name of the institution
city              City where the institution is located
state             State where the institution is located
country           Country where the institution is located

h. lab.csv

File Description: This file contains information about laboratories involved in a study.

Field           Description
lab_id          Unique identifier for the laboratory
lab_name        Name of the laboratory
lab_pi          Principal investigator associated with the lab
institution_id  Unique identifier of the institution the lab belongs to
study_id        Unique identifier for the study

i. author.csv

File Description: This file contains information about authors involved in a study publication.

Field            Description
author_id        Unique identifier for the author
name             Name of the author
email            Email of the author
investigator_id  Unique identifier of the investigator the author is associated with
study_id         Unique identifier for the study

j. publication.csv

File Description: This file contains information about publications resulting from a study.

Field                      Description
publication_id             Unique identifier for the publication
publication_title          Title of the publication
digital_object_identifier  Identifier for the digital object associated with the publication
publication_site           Publishing site
study_id                   Unique identifier for the study
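
Before uploading, the study folder can be compared against the file and column names listed in this section. The following sketch is one possible pre-upload check, not a DRH tool. The column lists are copied from the tables above, the helper names are hypothetical, and which files are mandatory is not stated here, so the sketch reports every absent file and leaves that judgment to the reader. Extra custom columns are accepted, and cgm_tracing files are not checked because their columns depend on the device.

```python
import csv
from pathlib import Path

# File names and columns as listed in this section.
EXPECTED_COLUMNS = {
    "cgm_file_metadata.csv": [
        "metadata_id", "devicename", "device_id", "source_platform", "patient_id",
        "file_name", "file_format", "file_upload_date", "data_start_date",
        "data_end_date", "map_field_of_cgm_date", "map_field_of_cgm_value", "study_id",
    ],
    "participant.csv": [
        "participant_id", "study_id", "site_id", "diagnosis_icd", "med_rxnorm",
        "treatment_modality", "gender", "race_ethnicity", "age", "bmi",
        "baseline_hba1c", "diabetes_type", "study_arm",
    ],
    "site.csv": ["study_id", "site_id", "site_name", "site_type"],
    "study.csv": [
        "study_id", "study_name", "start_date", "end_date", "treatment_modalities",
        "funding_source", "nct_number", "study_description",
    ],
    "investigator.csv": [
        "investigator_id", "investigator_name", "email", "institution_id", "study_id",
    ],
    "institution.csv": ["institution_id", "institution_name", "city", "state", "country"],
    "lab.csv": ["lab_id", "lab_name", "lab_pi", "institution_id", "study_id"],
    "author.csv": ["author_id", "name", "email", "investigator_id", "study_id"],
    "publication.csv": [
        "publication_id", "publication_title", "digital_object_identifier",
        "publication_site", "study_id",
    ],
}


def missing_columns(header, expected):
    """Return the expected column names that are absent from a header row."""
    present = {name.strip() for name in header}
    return [column for column in expected if column not in present]


def check_study_folder(folder):
    """Map each problematic file name to a list of human-readable problems."""
    problems = {}
    for file_name, expected in EXPECTED_COLUMNS.items():
        path = Path(folder) / file_name
        if not path.is_file():
            problems[file_name] = ["file not found"]
            continue
        # utf-8-sig tolerates a byte-order mark at the start of the file.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])
        missing = missing_columns(header, expected)
        if missing:
            problems[file_name] = ["missing column: " + column for column in missing]
    return problems


if __name__ == "__main__":
    for name, issues in check_study_folder("study-files").items():
        print(name, issues)
```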

5. Submit the CGM data

After selecting the appropriate data submission mode, please follow these steps to submit your data:

  1. Prepare Your Data/Database:
    Rename the database or study files folder to reflect your study name for easy identification.
    If uploading an Edge-generated database, please refer to the DRH Edge custom package setup and guidelines.

  2. Access Your SFTP Account:
    Log in to your SFTP account using the credentials provided by DRH via email.

  3. Upload to SFTP:
    Upload the renamed database or study folder to the SFTP folder assigned to you.

Important Note: Your data is secure and accessible only to you and DRH.


DRH Cloud

The DRH Cloud platform is designed for collaboration and scalability, making it ideal for users who:

  • Require seamless collaboration across teams, institutions, or external partners.
  • Are working with non-sensitive data that can be securely stored and processed on external servers.
  • Need scalable computing resources to handle large datasets efficiently, benefiting from robust cloud infrastructure.

DRH Edge

The DRH Edge platform is designed for localized data handling and is ideal for the following scenarios:

1. Data De-identification

  • Use when sensitive information, such as participant demographics or CGM data, requires anonymization before external sharing or publication.
  • Suitable for datasets containing personal details, including patient information, requiring compliance with privacy regulations and organizational data-sharing policies.

2. Local Data Preview and Analysis

  • Allows users to preview study data, analyze metrics, and view charts directly on their local machines.
  • Ensures secure data exploration in a controlled, offline environment without the need for external data sharing.
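
To make the idea of de-identification concrete, the sketch below replaces an identifier column with a keyed hash and drops a directly identifying column. This is not how DRH Edge works internally (DRH Edge relies on SQL tailored to each user's files); it is a generic illustration using the Python standard library. The function names, the patient_name column, and the 16-character truncation are assumptions.

```python
import csv
import hashlib
import hmac
import io


def pseudonymize(value, secret_key, length=16):
    """Replace a value with a truncated HMAC-SHA256 digest (stable for a given key)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def deidentify_csv(csv_text, id_columns, drop_columns, secret_key):
    """Return CSV text with id_columns pseudonymized and drop_columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name not in drop_columns]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        clean = {name: row[name] for name in kept}
        for column in id_columns:
            if column in clean:
                clean[column] = pseudonymize(clean[column], secret_key)
        writer.writerow(clean)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical input, for illustration only.
    sample = "participant_id,patient_name,age\nP001,Jane Doe,54\nP002,John Roe,61\n"
    print(deidentify_csv(sample, ["participant_id"], ["patient_name"], b"demo-key"))
```

A keyed hash keeps the same participant mapped to the same pseudonym across files, which preserves the links between participant and tracing records. The key must stay private, since anyone holding it could recompute the mapping.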

Using DRH Edge Software

If you prefer to use the DRH Edge software for de-identification and anonymization, please follow these steps:

  1. Meet the Edge Usage Prerequisites:

    • Provide the DRH Technical (Developer) team with detailed information about the file structure and column patterns in advance.
    • This information is essential for developing the SQL scripts and creating a tailored package suitable for the user’s specific needs.
    • Ensure the required files (marked in green) are included in the file preparation.
  2. Custom Package Preparation:

    • The DRH technical team will develop a bespoke package tailored to your organization’s file patterns.
    • Share detailed requirements for de-identification columns with the technical team.
  3. Verification and Upload:

    • After testing the tailored package and generating the database through DRH Edge, the Edge user verifies the generated database.
    • Once verified, upload the database to the designated SFTP folder for integration into the cloud version.

DRH Edge UI Setup and Data Transformation

In this guide, we will explain how to set up the DRH Edge tool and perform data transformation using a tailored sample package for UVA.

Current DRH Edge Sample Data Package

  • The sample DRH Edge package is modeled on the UVA file structure, as previously provided to the DRH Netspective team through the SFTP folder.
  • Users requiring a customized version should share their file structure and details about anonymization needs (e.g., specific columns to de-identify).

The DRH Edge tool allows you to securely convert your CSV files, perform de-identification, and conduct verification and validation (V&V) processes all within your own environment. You can view the results directly on your local system.

Requirements for Previewing the Edge UI:
  1. Surveilr Tool (Use the latest version unless specified).
    Note: The compatibility of the surveilr tool with the operating system (OS) should be periodically tested to ensure it continues to function as expected.

  2. Deno Runtime (requires Deno 2.0)
    Follow the Deno Installation Guide for step-by-step instructions.
    If Deno is already installed, upgrade to Deno 2.0 by running the following command as an administrator:

    deno upgrade

Getting Started

Step 1: Navigate to the Folder Containing the Files

  • Open the command prompt and navigate to the directory with your files.
  • Command: cd <folderpath>
  • Example: cd D:/DRH-Files

Step 2: Download Surveilr

Step 3: Verify the Tool Version

  • Run the command surveilr --version in the command prompt or .\surveilr --version in PowerShell.
  • If the tool is installed correctly, it will display the version number.

The folder structure should look like this:

surveilr.exe
study-files/
  ├── cgm_file_metadata.csv
  ├── participant.csv
  ├── cgm_tracing_001
  ├── cgm_tracing_002
  ├── cgm_tracing_003
  ├── cgm_tracing_004
  ├── cgm_tracing_005
  ├── cgm_tracing_006
  └── ...
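
The layout above can be checked before running the conversion. The sketch below assumes, based only on the tree shown, that cgm_file_metadata.csv, participant.csv, and at least one file starting with cgm_tracing are needed. The function names are hypothetical and this is not part of surveilr or DRH Edge.

```python
from pathlib import Path

# Taken from the folder tree shown above; treating them as required is an assumption.
REQUIRED_FILES = ["cgm_file_metadata.csv", "participant.csv"]
TRACING_PREFIX = "cgm_tracing"


def layout_problems(file_names):
    """Return a list of problems for a collection of file names in the study folder."""
    names = set(file_names)
    problems = ["missing " + name for name in REQUIRED_FILES if name not in names]
    if not any(name.startswith(TRACING_PREFIX) for name in names):
        problems.append("no cgm_tracing file found")
    return problems


def check_layout(study_folder):
    """Check a folder on disk, for example 'study-files'."""
    folder = Path(study_folder)
    if not folder.is_dir():
        return [str(folder) + " is not a folder"]
    return layout_problems(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    print(check_layout("study-files") or "layout looks complete")
```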

Step 4: Execute the Commands Below

  1. Refresh the cached copy of the script by running the following command:

    deno cache --reload https://raw.githubusercontent.com/surveilr/www.surveilr.com/main/lib/service/diabetes-research-hub/drhctl.ts

  2. After refreshing the cache, run the following single execution command:

    deno run -A https://raw.githubusercontent.com/surveilr/www.surveilr.com/main/lib/service/diabetes-research-hub/drhctl.ts 'foldername'

    • Replace foldername with the name of your folder containing all CSV files to be converted.

    Example:

    deno run -A https://raw.githubusercontent.com/surveilr/www.surveilr.com/main/lib/service/diabetes-research-hub/drhctl.ts study-files

    This method provides a streamlined approach to complete the process and view the results quickly.
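
For users who repeat Step 4 for several study folders, the two commands can be wrapped in a small script. The sketch below only assembles and runs the same deno commands shown above. The wrapper and its function names are hypothetical, and it assumes deno is on the PATH and that the script is started from the directory prepared in Step 1.

```python
import shutil
import subprocess

# URL copied from Step 4 of this guide.
DRHCTL_URL = (
    "https://raw.githubusercontent.com/surveilr/www.surveilr.com/main/"
    "lib/service/diabetes-research-hub/drhctl.ts"
)


def build_commands(folder_name):
    """Return the two Step 4 commands as argument lists, in execution order."""
    return [
        ["deno", "cache", "--reload", DRHCTL_URL],
        ["deno", "run", "-A", DRHCTL_URL, folder_name],
    ]


def run_pipeline(folder_name):
    """Run both commands, stopping at the first failure."""
    if shutil.which("deno") is None:
        raise RuntimeError("deno was not found on PATH")
    for command in build_commands(folder_name):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline("study-files")
```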

Step 5: Review the Verification and Validation Results in the UI

  • Check the following section in the UI and follow the steps as shown in the second image.

[Image: vv-image]

[Image: vv-steps-image]


Tools for CSV File Validation

We recommend the following third-party open-source tools to help validate whether your files adhere to the structure described above:

Data Curator

Data Curator is a lightweight desktop data editor designed to help describe, validate, and share open data. It offers a user-friendly graphical interface for editing tabular data while ensuring adherence to data standards.

Key Features:

  • Schema Editing: Easily modify Frictionless JSON schemas to suit your project’s requirements.
  • Load and Preview Data: Visualize and inspect CSV files in a user-friendly interface.
  • Validate Against Schema: Perform schema validation using the Frictionless JSON specification.
  • Edit and Save: Correct errors directly in the application and save validated data.
  • Metadata Management: Add metadata to improve data usability.
  • Export Options: Share validated datasets as clean CSV files or complete data packages.

Open Data Editor (ODE)

The Open Data Editor (ODE) is an online tool for non-technical data users to explore, validate, and detect errors in tabular datasets. It provides an intuitive web interface for identifying and correcting issues in open data.

Key Features:

  • Online Access: No installation required—access it directly through your browser.
  • Error Detection: Quickly identify errors in table formatting and data values.
  • Schema Validation: Validate datasets against predefined schemas, including Frictionless JSON schema.
  • Interactive Editing: Fix errors and inconsistencies in real-time.
  • Export Options: Save corrected files for further use or sharing.