Melior Data Overview
This vignette provides a comprehensive overview of the Melior electronic health record data used in the SEM cohort study. The data is organized into three temporal categories:
- Pre-SEM Data: Data collected before patient inclusion in the SEM cohort
- SEM Data: Data collected during the SEM cohort contact/hospitalization
- Post-SEM Data: Data collected after discharge from the SEM cohort contact
Each dataset follows standardized variable naming conventions and processing procedures as outlined in the Variable Naming Standards.
Pre-SEM Data
These datasets capture patient history prior to the index SEM cohort contact, providing context for each patient's clinical presentation.
| Dataset | Description | Key Variables | Timeframe |
|---|---|---|---|
| Historical Diagnoses | Diagnoses recorded within 5 years prior to SEM contact | diagnosis_code, diagnosis_type, care_form | 5 years before 2017-2018 contacts |
| Pre-SEM Medications | Medication prescriptions and administrations before SEM contact | medication_atc_code, prescription_start_date, medication_name | 1 year before arrival |
SEM Cohort Data
These datasets capture clinical information during the index SEM cohort contact/hospitalization period.
| Dataset | Description | Key Variables | Timeframe |
|---|---|---|---|
| SEM Diagnoses | All diagnoses recorded during SEM contact | diagnosis_code, diagnosis_type, care_episode_start/end | 2017-2018 contacts |
| ED Diagnoses | Preliminary diagnoses recorded in the emergency department | diagnosis_code, diagnosis_type | 2017-2018 ED visits |
| SEM Interventions | Healthcare interventions performed during SEM contact | intervention_code, intervention_description, care_episode_start/end | 2017-2018 contacts |
| Laboratory Tests (24h) | Lab test results obtained within 24 hours of arrival | lab_test_name, test_result_value, test_result_unit | First 24 hours from arrival |
| Medications (24h) | Medication prescriptions and administrations within 24 hours of arrival | medication_atc_code, medication_name, administration_method | First 24 hours from arrival |
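
The 24-hour datasets are already restricted to this window, but an analysis may need a narrower one or a check on the time elapsed since arrival. The following is a minimal base R sketch on invented toy data. The columns `arrival_time` and `sample_time` and the `UTC` time zone are assumptions made for illustration; check the dataset documentation for the actual timestamp variables.

```r
# Toy data; all values are invented. `arrival_time` and `sample_time`
# are hypothetical column names used only for this sketch.
arrivals <- data.frame(
  contact_id = c(101, 102),
  arrival_time = as.POSIXct(c("2017-03-01 08:00:00", "2018-06-15 22:30:00"),
                            tz = "UTC")
)
labs <- data.frame(
  contact_id = c(101, 101, 102),
  lab_test_name = c("CRP", "CRP", "Hb"),
  test_result_value = c(12, 48, 131),
  sample_time = as.POSIXct(c("2017-03-01 09:15:00", "2017-03-02 10:00:00",
                             "2018-06-16 01:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Attach the arrival time of each contact to its lab results.
labs <- merge(labs, arrivals, by = "contact_id")

# Hours elapsed between arrival and sampling.
labs$hours_from_arrival <- as.numeric(
  difftime(labs$sample_time, labs$arrival_time, units = "hours")
)

# Keep results sampled within the first 24 hours after arrival.
labs_24h <- labs[labs$hours_from_arrival >= 0 & labs$hours_from_arrival <= 24, ]
```

Changing the upper bound (for example to 6) narrows the window without altering the rest of the code.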
Post-SEM Data
These datasets capture healthcare utilization and outcomes after discharge from the index SEM cohort contact.
| Dataset | Description | Key Variables | Timeframe |
|---|---|---|---|
| Post-SEM Care Episodes | Inpatient care episodes after discharge | care_episode_start/end, ward, care_episode_duration_days | After discharge through Dec 31, 2019 |
| Post-SEM Diagnoses (30d) | Diagnoses recorded within 30 days after SEM contact | diagnosis_code, diagnosis_type, care_form | 30 days after 2017-2018 contacts |
| Post-SEM Diagnoses (All) | All diagnoses recorded after discharge | diagnosis_code, diagnosis_type, care_form | After discharge through Dec 31, 2019 |
| Post-SEM Inpatient Hours | Duration of inpatient care in hours | contact_id, patient_id, duration_hours | Within 30 days after discharge |
| Post-SEM Inpatient Stays | Inpatient hospital stays after discharge | duration_minutes, duration_days | After discharge through Dec 31, 2019 |
| Post-SEM Interventions (30d) | Healthcare interventions performed within 30 days after SEM contact | intervention_code, intervention_description, care_form | 30 days after 2017-2018 contacts |
| Post-SEM Interventions (All) | All healthcare interventions performed after discharge | intervention_code, intervention_description, care_form | After discharge through Dec 31, 2019 |
| Return to ED | First return to the emergency department after discharge | discharged_to, admission_ward, discharge_date | Within 30 days after discharge |
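
As an illustration of how a post-SEM dataset might be summarised, the sketch below totals inpatient hours per patient using the variables listed for Post-SEM Inpatient Hours. The data are invented, and the derived column `total_days` is a hypothetical name, not a variable in the dataset.

```r
# Toy data; all values are invented for illustration.
inpatient_hours <- data.frame(
  patient_id = c(1, 1, 2),
  contact_id = c(201, 202, 203),
  duration_hours = c(30, 12.5, 72)
)

# Total inpatient hours per patient across their post-SEM contacts.
total_hours <- aggregate(duration_hours ~ patient_id,
                         data = inpatient_hours, FUN = sum)

# Convenience conversion to days (`total_days` is a hypothetical name).
total_hours$total_days <- total_hours$duration_hours / 24
```

Patients with no post-SEM inpatient care do not appear in such a summary; they would need to be added back from the cohort with a total of zero.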
Data Relationships
The datasets can be linked through common identifiers:
- patient_id: Links all records for a specific patient across datasets
- contact_id: Links records related to a specific healthcare contact/encounter
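
A minimal sketch of both kinds of link, using base R `merge()` on invented toy data, is shown here. The data frame names and the intervention codes are placeholders, not real KVÅ codes or actual object names from the package.

```r
# Toy data; all values are invented. Intervention codes are placeholders.
sem_diagnoses <- data.frame(
  patient_id = c(1, 1, 2),
  contact_id = c(101, 101, 102),
  diagnosis_code = c("I10", "E11", "J18"),
  stringsAsFactors = FALSE
)
sem_interventions <- data.frame(
  patient_id = c(1, 2),
  contact_id = c(101, 102),
  intervention_code = c("XX001", "XX002"),
  stringsAsFactors = FALSE
)
historical_diagnoses <- data.frame(
  patient_id = c(1, 1, 3),
  diagnosis_code = c("I25", "N18", "K35"),
  stringsAsFactors = FALSE
)

# Contact-level link: records that belong to the same healthcare contact.
by_contact <- merge(sem_diagnoses, sem_interventions,
                    by = c("patient_id", "contact_id"))

# Patient-level link: history for every cohort patient. all.x = TRUE keeps
# cohort patients without any historical record (their columns become NA).
cohort <- unique(sem_diagnoses[, c("patient_id", "contact_id")])
history <- merge(cohort, historical_diagnoses,
                 by = "patient_id", all.x = TRUE)
```

Both joins are one-to-many, so the result can have more rows than the cohort. Summarising the linked dataset to one row per patient or contact before merging avoids unintended row duplication.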
Data Characteristics
Temporal Coverage
- Pre-SEM data: Primarily covers 2012-2017
- SEM data: Focuses on 2017-2018
- Post-SEM data: Covers 2017-2019
Data Types
The Melior datasets include several types of clinical information:
- Diagnoses: Using ICD-10 codes
- Interventions/Procedures: Using KVÅ codes
- Laboratory Tests: With values and reference ranges
- Medications: With ATC codes and administration details
- Care Episodes: With timestamps and duration information
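
Because ICD-10 and ATC codes are hierarchical, grouping by code prefix is a common first step. The sketch below assumes codes are stored as character strings, possibly with a dot in ICD-10 codes; the exact storage format in the Melior datasets should be confirmed in the dataset documentation. The derived vector names are hypothetical.

```r
# Toy codes for illustration only.
diagnosis_code <- c("I10", "E11.9", "J189")
medication_atc_code <- c("C09AA02", "A10BA02", "J01CA04")

# ICD-10: drop any dot, then take the three-character category.
icd10_clean <- gsub(".", "", diagnosis_code, fixed = TRUE)
icd10_category <- substr(icd10_clean, 1, 3)

# ATC: the first character is the anatomical main group (level 1),
# the first three characters are the therapeutic subgroup (level 2).
atc_level1 <- substr(medication_atc_code, 1, 1)
atc_level2 <- substr(medication_atc_code, 1, 3)
```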
Data Processing Notes
- All datasets have been standardized to use consistent English variable names
- Original Swedish variable names are preserved in the documentation
- Swedish categorical values (e.g., care forms) have been translated to English in some datasets
- Dates and times are stored in POSIXct format
- Numeric fields have been validated and corrected where needed
- Known missing or erroneous values are described in each dataset's documentation
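
Before computing on timestamps, it can be worth confirming that a column is stored as POSIXct and counting its missing values. The helper below is a hypothetical sketch, not a function provided with the data; `check_time_columns`, `start_time`, and `end_time` are illustrative names.

```r
# Hypothetical helper: report storage class and missingness per column.
check_time_columns <- function(df, time_cols) {
  data.frame(
    column = time_cols,
    is_posixct = vapply(time_cols,
                        function(col) inherits(df[[col]], "POSIXct"),
                        logical(1)),
    n_missing = vapply(time_cols,
                       function(col) sum(is.na(df[[col]])),
                       integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy data: one proper POSIXct column with a missing value, and one
# column left as character to show what the check flags.
episodes <- data.frame(
  start_time = as.POSIXct(c("2017-05-01 10:00:00", NA), tz = "UTC"),
  end_time = c("2017-05-03 14:00:00", "2017-06-01 09:00:00"),
  stringsAsFactors = FALSE
)

report <- check_time_columns(episodes, c("start_time", "end_time"))
```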
Usage Guide
For most analyses, the recommended approach is to:
- Start with the SEM cohort data that defines your study population
- Link to pre-SEM data to understand patient history and risk factors
- Link to post-SEM data to analyze outcomes and healthcare utilization
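
The three steps above can be sketched as follows in base R, with invented toy data. The data frame names, the derived columns `prior_heart_failure` and `returned_to_ed_30d`, and the choice of heart failure (ICD-10 category I50) as the risk factor are assumptions for illustration. The sketch also assumes that a patient's presence in the Return to ED dataset indicates a return within 30 days.

```r
# Step 1: the SEM cohort defines the study population
# (one row per index contact).
cohort <- data.frame(
  patient_id = c(1, 2, 3),
  contact_id = c(101, 102, 103)
)

# Step 2: pre-SEM history, reduced to one flag per patient.
historical_diagnoses <- data.frame(
  patient_id = c(1, 1, 3),
  diagnosis_code = c("I50", "E11", "I50"),
  stringsAsFactors = FALSE
)
is_hf <- substr(historical_diagnoses$diagnosis_code, 1, 3) == "I50"
hf_patients <- unique(historical_diagnoses$patient_id[is_hf])
cohort$prior_heart_failure <- cohort$patient_id %in% hf_patients

# Step 3: post-SEM outcome, again as one flag per patient.
return_to_ed <- data.frame(patient_id = c(2, 3))
cohort$returned_to_ed_30d <- cohort$patient_id %in% return_to_ed$patient_id

# Cross-tabulate risk factor against outcome.
outcome_table <- table(
  prior_heart_failure = cohort$prior_heart_failure,
  returned_to_ed_30d = cohort$returned_to_ed_30d
)
```

Deriving flags with `%in%` rather than merging the full pre- and post-SEM tables keeps the cohort at one row per index contact, which is usually the unit of analysis.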
Each dataset is accompanied by detailed documentation explaining the variables, data quality issues, and examples for proper usage.