SEM Emergency Department Diagnoses Dataset (2017-2018)

This dataset contains diagnoses recorded at the emergency department during healthcare contacts for patients in the SEM cohort, extracted from the Melior electronic health record system. The diagnoses are linked to emergency department visits during 2017-2018.

Usage

melior_sem_ed_diagnoses()

Format

A data frame with 551,657 observations and 6 variables:

contact_id: Character. Unique identifier for each healthcare contact/encounter, serves as a foreign key to link with other datasets. Original field name: KontaktId
patient_id: Integer. Patient pseudonym identifier, serves as a foreign key to link with other patient-level data. Original field name: Alias
diagnosis_type: Character. Type of diagnosis. Values: "Huvuddiagnos"/"huvuddiagnos" (92.3%, primary diagnosis), "Bidiagnos" (6.3%, secondary diagnosis), "bidiagnos tillägg ICD10" (1.4%, secondary diagnosis ICD10 supplement). Original field name: Diagnostyp
diagnosis_code: Character. Patient diagnosis code (ICD-10 code). 7,730 unique values across the dataset. Original field name: PatientDiagnos_Kod
diagnosis_description: Character. Description of the diagnosis. 7,598 unique values across the dataset. Contains 285 NA values (0.1%). Original field name: PatientDiagnos_Beskrivning
diagnosis_modified_date: POSIXct. Date/time when the diagnosis was recorded/modified. Date range: 2013-06-12 to 2020-11-27, with 99.1% of entries from 2017-2018. Original field name: PatientDiagnos_ModifieradDatum
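
Because diagnosis_type mixes capitalisation ("Huvuddiagnos" and "huvuddiagnos") and has two secondary variants, it may help to harmonise it before counting or filtering. The following is a minimal sketch on a toy data frame with illustrative values; the diagnosis_role column and its "primary"/"secondary" labels are hypothetical and not part of the dataset.

```r
# Toy data mimicking two documented columns (illustrative values, not real records)
ed <- data.frame(
  contact_id = c("K1", "K1", "K2", "K3"),
  diagnosis_type = c("Huvuddiagnos", "Bidiagnos", "huvuddiagnos",
                     "bidiagnos tillägg ICD10"),
  stringsAsFactors = FALSE
)

# Lower-case the raw labels so both spellings of the primary type match
type_lower <- tolower(ed$diagnosis_type)

# Hypothetical harmonised column: everything that is not a primary
# diagnosis is treated as secondary (an assumption of this sketch)
ed$diagnosis_role <- ifelse(type_lower == "huvuddiagnos", "primary", "secondary")

# Tabulate the harmonised roles
table(ed$diagnosis_role)
```

Whether the ICD10 supplement rows should be grouped with ordinary secondary diagnoses depends on the analysis; they are grouped here only for brevity.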

Source

Melior

Details

This file was extracted from the Melior electronic health record system. The original filename indicates that it contains preliminary assessment diagnoses (PreliminärBedömningDiagnos) recorded at the emergency department (PåAkuten) during healthcare contacts (VidVårdkontakt) in 2017-2018. The diagnosis codes follow the ICD-10 coding system.

These diagnoses represent the initial assessment made at the emergency department, which may differ from the final diagnoses recorded after a complete evaluation. The dataset includes 551,657 diagnosis entries across emergency department contacts.

Note

Several fields from the original dataset have been omitted for efficiency and clarity:
- AktivitetTyp (activity_type): This field contained two values ("Akutkliniken Läk", 82% and "Akutmottagning Läk", 18%) that merely represented organizational differences between hospitals in how they record emergency department visits, with no clinical significance.
- VårdtillfälleFörDiagnos_VardformText (care_form): This field contained a single value ("Slutenvård" = Inpatient) across all records, providing no discriminative information.

Although the dataset covers 2017-2018, 0.9% of records have modification dates outside this range, from as early as 2013 to as late as 2020.

The dataset does not include the care episode start and end dates that are present in some other diagnosis datasets. These can be obtained by linking to the care event via contact_id.
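
The two points above can be sketched together: flagging records whose modification date falls outside 2017-2018, and attaching episode dates through contact_id. This is a minimal sketch on toy data. The contacts table, its contact_start column, the in_study_window flag, and the UTC time zone are all assumptions made for illustration and are not documented here.

```r
# Toy diagnoses (illustrative values, not real records)
ed <- data.frame(
  contact_id = c("K1", "K2", "K3"),
  diagnosis_modified_date = as.POSIXct(
    c("2013-06-12 10:00:00", "2017-03-01 08:30:00", "2018-12-31 23:59:00"),
    tz = "UTC"  # assumed time zone
  ),
  stringsAsFactors = FALSE
)

# Half-open window [2017-01-01, 2019-01-01) covering all of 2017 and 2018
window_start <- as.POSIXct("2017-01-01", tz = "UTC")
window_end   <- as.POSIXct("2019-01-01", tz = "UTC")

# Hypothetical flag marking records modified inside the study window
ed$in_study_window <- ed$diagnosis_modified_date >= window_start &
  ed$diagnosis_modified_date < window_end

# Hypothetical contact-level table carrying the episode start time
contacts <- data.frame(
  contact_id = c("K1", "K2", "K3"),
  contact_start = as.POSIXct(
    c("2017-01-05 09:00:00", "2017-03-01 07:45:00", "2018-12-31 22:10:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Left join: keep every diagnosis row, add contact dates where available
ed_with_dates <- merge(ed, contacts, by = "contact_id", all.x = TRUE)
```

A late modification date does not necessarily mean the visit itself fell outside 2017-2018, so filtering on the contact dates may be more appropriate than filtering on diagnosis_modified_date.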

Examples

# Load the raw data
library(readr)
library(here)
#> here() starts at /Users/an1583jo/Documents/Forsk/SEM
ed_diagnoses_file <- "MELIOR_PreliminärBedömningDiagnosPåAkutenVidVårdkontakt_2017_2018.csv"
data <- read_delim(here("data", "raw", ed_diagnoses_file),
                   delim = "|",
                   locale = locale(encoding = "ISO-8859-1"))
#> Rows: 551657 Columns: 8
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "|"
#> chr  (6): KontaktId, AktivitetTyp, Diagnostyp, VårdtillfälleFörDiagnos_Vardf...
#> dbl  (1): Alias
#> dttm (1): PatientDiagnos_ModifieradDatum
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
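
The step from the eight raw columns to the six documented variables can be sketched as follows. This is an illustration on a toy data frame and not necessarily how melior_sem_ed_diagnoses() is implemented. The names raw, rename_map and clean are hypothetical, as are the codes and descriptions, and the integer conversion of Alias is assumed from the documented type of patient_id.

```r
# Toy stand-in for the raw file (illustrative values, not real records)
raw <- data.frame(
  KontaktId = c("K1", "K2"),
  Alias = c(101, 102),                      # read as double by read_delim
  AktivitetTyp = c("Akutkliniken Läk", "Akutmottagning Läk"),
  Diagnostyp = c("Huvuddiagnos", "Bidiagnos"),
  PatientDiagnos_Kod = c("R07.4", "S52.5"),
  PatientDiagnos_Beskrivning = c("example description A", "example description B"),
  PatientDiagnos_ModifieradDatum = as.POSIXct(
    c("2017-03-01 08:30:00", "2018-06-15 14:00:00"), tz = "UTC"
  ),
  stringsAsFactors = FALSE
)
# The care form column is constant in the source data
raw[["VårdtillfälleFörDiagnos_VardformText"]] <- "Slutenvård"

# Original field name -> documented variable name
rename_map <- c(
  KontaktId = "contact_id",
  Alias = "patient_id",
  Diagnostyp = "diagnosis_type",
  PatientDiagnos_Kod = "diagnosis_code",
  PatientDiagnos_Beskrivning = "diagnosis_description",
  PatientDiagnos_ModifieradDatum = "diagnosis_modified_date"
)

# Keep only the mapped columns (this drops AktivitetTyp and the care form
# column), then apply the documented names
clean <- raw[, names(rename_map)]
names(clean) <- unname(rename_map)

# Match the documented Integer type of patient_id
clean$patient_id <- as.integer(clean$patient_id)

str(clean)
```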