Melior Data Overview

This vignette provides a comprehensive overview of the Melior electronic health record data used in the SEM cohort study. The data is organized into three temporal categories:

  1. Pre-SEM Data: Data collected before patient inclusion in the SEM cohort
  2. SEM Data: Data collected during the SEM cohort contact/hospitalization
  3. Post-SEM Data: Data collected after discharge from the SEM cohort contact

Each dataset follows standardized variable naming conventions and processing procedures as outlined in the Variable Naming Standards.

Pre-SEM Data

These datasets capture patient history prior to the index SEM cohort contact, providing context for their clinical presentation.

| Dataset | Description | Key Variables | Timeframe |
|---|---|---|---|
| Historical Diagnoses | Diagnoses recorded within 5 years prior to SEM contact | diagnosis_code, diagnosis_type, care_form | 5 years before 2017-2018 contacts |
| Pre-SEM Medications | Medication prescriptions and administrations before SEM contact | medication_atc_code, prescription_start_date, medication_name | 1 year before arrival |

SEM Cohort Data

These datasets capture clinical information during the index SEM cohort contact/hospitalization period.

| Dataset | Description | Key Variables | Timeframe |
|---|---|---|---|
| SEM Diagnoses | All diagnoses recorded during SEM contact | diagnosis_code, diagnosis_type, care_episode_start/end | 2017-2018 contacts |
| ED Diagnoses | Preliminary diagnoses recorded in the emergency department | diagnosis_code, diagnosis_type | 2017-2018 ED visits |
| SEM Interventions | Healthcare interventions performed during SEM contact | intervention_code, intervention_description, care_episode_start/end | 2017-2018 contacts |
| Laboratory Tests (24h) | Lab test results obtained within 24 hours of arrival | lab_test_name, test_result_value, test_result_unit | First 24 hours from arrival |
| Medications (24h) | Medication prescriptions and administrations within 24 hours of arrival | medication_atc_code, medication_name, administration_method | First 24 hours from arrival |

Post-SEM Data

These datasets capture healthcare utilization and outcomes after discharge from the index SEM cohort contact.

| Dataset | Description | Key Variables | Timeframe |
|---|---|---|---|
| Post-SEM Care Episodes | Inpatient care episodes after discharge | care_episode_start/end, ward, care_episode_duration_days | After discharge through Dec 31, 2019 |
| Post-SEM Diagnoses (30d) | Diagnoses recorded within 30 days after SEM contact | diagnosis_code, diagnosis_type, care_form | 30 days after 2017-2018 contacts |
| Post-SEM Diagnoses (All) | All diagnoses recorded after discharge | diagnosis_code, diagnosis_type, care_form | After discharge through Dec 31, 2019 |
| Post-SEM Inpatient Hours | Duration of inpatient care in hours | contact_id, patient_id, duration_hours | Within 30 days after discharge |
| Post-SEM Inpatient Stays | Inpatient hospital stays after discharge | duration_minutes, duration_days | After discharge through Dec 31, 2019 |
| Post-SEM Interventions (30d) | Healthcare interventions performed within 30 days after SEM contact | intervention_code, intervention_description, care_form | 30 days after 2017-2018 contacts |
| Post-SEM Interventions (All) | All healthcare interventions performed after discharge | intervention_code, intervention_description, care_form | After discharge through Dec 31, 2019 |
| Return to ED | First return to the emergency department after discharge | discharged_to, admission_ward, discharge_date | Within 30 days after discharge |

Data Relationships

The datasets can be linked through common identifiers:

  • patient_id: Links all records for a specific patient across datasets
  • contact_id: Links records related to a specific healthcare contact/encounter
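
As a rough illustration of the two linkage levels, the sketch below joins small in-memory records on contact_id and on patient_id using only the Python standard library. The records, the placeholder intervention codes, and the helper names index_by and inner_join are hypothetical and not part of the Melior package; the real datasets carry more columns and would normally be loaded from the package itself.

```python
from collections import defaultdict

# Hypothetical toy records; values are illustrative only.
sem_diagnoses = [
    {"patient_id": "P1", "contact_id": "C10", "diagnosis_code": "I21"},
    {"patient_id": "P2", "contact_id": "C20", "diagnosis_code": "J18"},
]
sem_interventions = [
    {"patient_id": "P1", "contact_id": "C10", "intervention_code": "X001"},
    {"patient_id": "P1", "contact_id": "C10", "intervention_code": "X002"},
]
pre_sem_medications = [
    {"patient_id": "P2", "medication_atc_code": "N02BE01"},
]


def index_by(records, key):
    """Group records by the value of one identifier column."""
    grouped = defaultdict(list)
    for row in records:
        grouped[row[key]].append(row)
    return grouped


def inner_join(left, right, key):
    """Return one merged row per matching (left, right) pair on `key`."""
    right_index = index_by(right, key)
    joined = []
    for left_row in left:
        for right_row in right_index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# Contact-level link: records that belong to the same encounter.
same_contact = inner_join(sem_diagnoses, sem_interventions, "contact_id")

# Patient-level link: records for the same patient across datasets.
same_patient = inner_join(sem_diagnoses, pre_sem_medications, "patient_id")
```

Joining on contact_id keeps the analysis within a single encounter, while joining on patient_id pulls in records from any encounter for that patient, so the choice of key determines the unit of analysis.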

Data Characteristics

Temporal Coverage

  • Pre-SEM data: Primarily covers 2012-2017
  • SEM data: Focuses on 2017-2018
  • Post-SEM data: Covers 2017-2019

Data Types

The Melior datasets include several types of clinical information:

  1. Diagnoses: Using ICD-10 codes
  2. Interventions/Procedures: Using KVÅ codes
  3. Laboratory Tests: With values and reference ranges
  4. Medications: With ATC codes and administration details
  5. Care Episodes: With timestamps and duration information

Data Processing Notes

  • All datasets have been standardized to use consistent English variable names
  • Original Swedish variable names are preserved in the documentation
  • Swedish categorical values (e.g., care forms) have been translated to English in some datasets
  • Dates and times are stored in POSIXct format
  • Numeric fields have been validated and corrected where needed
  • Missing or erroneous values are described in each dataset’s documentation

Usage Guide

For most analyses, the recommended approach is to:

  1. Start with the SEM cohort data that defines your study population
  2. Link to pre-SEM data to understand patient history and risk factors
  3. Link to post-SEM data to analyze outcomes and healthcare utilization
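
The three steps above can be sketched as follows, again with hypothetical toy records in plain Python. The field names care_episode_start, care_episode_end, and event_time, the helper summarize_contact, and the 30-day default window are assumptions made for illustration; check each dataset's documentation for the actual variable names and time windows.

```python
from datetime import datetime, timedelta

# Step 1: the SEM cohort defines the study population (toy data).
cohort = [
    {"patient_id": "P1", "contact_id": "C10",
     "care_episode_start": datetime(2017, 3, 1, 8, 0),
     "care_episode_end": datetime(2017, 3, 4, 12, 0)},
    {"patient_id": "P2", "contact_id": "C20",
     "care_episode_start": datetime(2018, 1, 8, 9, 0),
     "care_episode_end": datetime(2018, 1, 10, 15, 0)},
]

# Step 2 input: pre-SEM history (hypothetical event_time column).
history = [
    {"patient_id": "P1", "diagnosis_code": "I10", "event_time": datetime(2015, 6, 1)},
    {"patient_id": "P1", "diagnosis_code": "E11", "event_time": datetime(2016, 2, 15)},
]

# Step 3 input: post-SEM events, e.g. returns to the ED.
post_events = [
    {"patient_id": "P1", "event_time": datetime(2017, 3, 20, 10, 0)},
    {"patient_id": "P2", "event_time": datetime(2018, 3, 1, 10, 0)},
]


def summarize_contact(contact, history, post_events, window_days=30):
    """Attach a history count and a post-discharge outcome flag to one contact."""
    pid = contact["patient_id"]
    start = contact["care_episode_start"]
    end = contact["care_episode_end"]
    window_end = end + timedelta(days=window_days)

    # Pre-SEM: records for this patient dated before the index contact began.
    prior = [h for h in history
             if h["patient_id"] == pid and h["event_time"] < start]
    # Post-SEM: records after discharge and inside the follow-up window.
    later = [p for p in post_events
             if p["patient_id"] == pid and end < p["event_time"] <= window_end]

    return {
        "patient_id": pid,
        "contact_id": contact["contact_id"],
        "n_prior_diagnoses": len(prior),
        "event_within_window": bool(later),
    }


summary = [summarize_contact(c, history, post_events) for c in cohort]
```

Anchoring both the look-back and the follow-up window to the index contact's own timestamps keeps pre-SEM and post-SEM records from being mixed when a patient has several encounters.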

Each dataset is accompanied by detailed documentation explaining the variables, data quality issues, and examples for proper usage.