This dataset contains diagnoses recorded after discharge for patients in the SEM cohort, taken from the Melior journal system. It covers diagnoses dated up to and including December 31, 2019.

Usage

melior_post_sem_diagnoses_all()

Format

A data frame with 11,681,602 observations and 8 variables:

contact_id

Character. Unique identifier for each healthcare contact/encounter, serves as a foreign key to link with other datasets. Original field name: KontaktId

patient_id

Integer. Patient pseudonym identifier, serves as a foreign key to link with other patient-level data. Original field name: Alias

activity_type

Character. Type of healthcare activity or note. 287 unique values. Most common: "Epikris, tvärprofessionell" (19.7%), "Akutkliniken Läk" (10.3%), "Mott Ögon Läk" (5.2%). Original field name: AktivitetTyp

diagnosis_type

Character. Type of diagnosis. 8 unique values with main categories: "Huvuddiagnos"/"huvuddiagnos" (65.5%, primary diagnosis), "Bidiagnos" (32.7%, secondary diagnosis), "bidiagnos tillägg ICD10" (1.5%, secondary diagnosis ICD10 supplement). Original field name: Diagnostyp

diagnosis_code

Character. Patient diagnosis code (ICD-10 code). 12,902 unique values across the dataset. Original field name: PatientDiagnos_Kod

diagnosis_description

Character. Description of the diagnosis. 12,421 unique values across the dataset. Contains 10,386 NA values (0.1%). Original field name: PatientDiagnos_Beskrivning

diagnosis_modified_date

POSIXct. Date/time when the diagnosis was recorded/modified. Date range: 2017-01-01 to 2019-12-31. Distribution by year: 2019 (42.1%), 2018 (40.7%), 2017 (17.2%). Original field name: PatientDiagnos_ModifieradDatum

care_form

Character. Form of care ("Slutenvård" = Inpatient, "Öppenvård" = Outpatient). Distribution: Outpatient (54.8%), Inpatient (45.2%). Original field name: VårdtillfälleFörDiagnos_VardformText
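The variables above suggest a few routine preparation steps: collapsing the two spellings of the primary diagnosis label, translating care_form, and linking on patient_id. The following is a minimal sketch in base R, not part of the package. It runs on a small made-up data frame so that it is self-contained; the `patients` table and its `birth_year` column are hypothetical and stand in for any patient-level dataset keyed on patient_id.

```r
# Toy stand-in with made-up rows. With the package loaded you would
# instead start from: dx <- melior_post_sem_diagnoses_all()
dx <- data.frame(
  contact_id     = c("K1", "K2", "K3", "K4"),
  patient_id     = c(101L, 101L, 102L, 103L),
  diagnosis_type = c("Huvuddiagnos", "huvuddiagnos", "Bidiagnos",
                     "bidiagnos tillägg ICD10"),
  care_form      = c("Slutenvård", "Öppenvård", "Öppenvård", "Slutenvård"),
  stringsAsFactors = FALSE
)

# Collapse case variants such as "Huvuddiagnos" / "huvuddiagnos"
dx$diagnosis_type_std <- tolower(dx$diagnosis_type)

# Translate care_form using the mapping given in the variable description
care_form_map <- c("Slutenvård" = "Inpatient", "Öppenvård" = "Outpatient")
dx$care_form_en <- unname(care_form_map[dx$care_form])

# Hypothetical patient-level table, linked through the patient_id key
patients <- data.frame(
  patient_id = c(101L, 102L),
  birth_year = c(1950L, 1972L)  # invented column, for illustration only
)

# Left join: keep every diagnosis row, even without a patient match
linked <- merge(dx, patients, by = "patient_id", all.x = TRUE)

table(dx$diagnosis_type_std)
```

A left join (`all.x = TRUE`) is used so that diagnosis rows are not silently dropped when a patient is missing from the other table; whether that is the right choice depends on the analysis.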

Source

Melior

Details

This file was extracted from the Melior electronic health record system. The original filename indicates that it contains patient diagnoses (PatientDiagnoser) recorded after discharge (Efter_Vardkontakt), with discharge dates (UtskrivningDatum) up to December 31, 2019 (Till_20191231). The diagnosis codes follow the ICD-10 coding system.

Note

  • Unlike some other diagnosis datasets, this dataset does not include care episode start and end dates

  • The care_form field contains two values: "Öppenvård" (Outpatient, 54.8%) and "Slutenvård" (Inpatient, 45.2%)

  • The diagnosis_description field has 10,386 missing values (0.1% of the dataset). Because diagnosis_type has no missing values, this is unlikely to be a problem.

  • Several fields from the original dataset have been omitted for efficiency:

    • AktivitetTermId: Numeric identifier that in all but one case corresponded to activity_type

    • TermId: Numeric identifier that in all but one case corresponded to diagnosis_type

  • This dataset represents diagnoses recorded after the initial SEM cohort contact/discharge and tracks them through the end of 2019
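The missing-value and date figures quoted in these notes can be re-checked with a few lines of base R. The sketch below uses a small made-up data frame with the same column names, so the numbers it produces are illustrative only and are not the dataset's real figures.

```r
# Toy stand-in with made-up rows. With the package loaded you would
# instead start from: dx <- melior_post_sem_diagnoses_all()
dx <- data.frame(
  diagnosis_type = c("Huvuddiagnos", "Bidiagnos", "Huvuddiagnos", "Bidiagnos"),
  diagnosis_code = c("I10", "E11", "J18", "I10"),  # illustrative ICD-10 codes
  diagnosis_description = c("Hypertoni", NA, "Pneumoni", "Hypertoni"),
  diagnosis_modified_date = as.POSIXct(
    c("2017-03-01 10:00:00", "2018-06-15 08:30:00",
      "2019-01-20 14:00:00", "2019-12-31 23:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na_counts <- colSums(is.na(dx))

# Percentage of rows with a missing description
na_share <- round(100 * mean(is.na(dx$diagnosis_description)), 1)

# Percentage of rows per calendar year of diagnosis_modified_date
year_share <- round(
  100 * prop.table(table(format(dx$diagnosis_modified_date, "%Y"))), 1
)

na_counts
year_share
```

On the full dataset the same calls should be compared against the documented 10,386 missing descriptions and the 2017 to 2019 yearly distribution.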

Examples