SCB Residence DeSO Dataset (2012-2018)
scb_deso.Rd
This dataset contains information about patients' residential geographic areas according to the DeSO (Demographic Statistical Areas) classification system. The data spans from 2012 to 2018 and links patients to their residential areas for each year.
Format
A data frame with variables:
- patient_id
Integer. Patient identifier, serves as a foreign key to link with other patient-level data. Original field name: Alias in the key file, mapped from lopnr
- latest_personal_id
Logical. Flag indicating if this is the latest personal ID number for the individual. Original field name: SenPNr
- reused_personal_id
Logical. Flag indicating if this personal ID number has been reused. Original field name: AterPNr
- coordination_number
Logical. Flag indicating if this is a coordination number rather than a standard personal ID number. Coordination numbers are used for individuals who do not have a permanent personal ID number. Original field name: SamOrdnNr
- incorrect_personal_id
Logical. Flag indicating if there are known errors with the ID. Original field name: FelPersonnr
- deso_code
Character. DeSO (Demographic Statistical Areas) code identifying the geographic area where the patient resides. Original field name: DeSO
- year
Integer. The year that the residential information pertains to, extracted from the original filename.
Details
This file combines data from multiple yearly SCB files about patients' residential areas according to the DeSO classification system. DeSO (Demografiska statistikområden) is a geographic subdivision introduced by Statistics Sweden in 2018 to enable statistical analysis at a detailed geographic level.
The original files contained residential data for each year from 2012 to 2018. This processed dataset combines all years, adding a year column to distinguish between different time points.
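The combining step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual processing script: the file names (deso_2012.csv, deso_2013.csv), the helper extract_year() and all values are hypothetical, and in-memory data frames stand in for the original yearly files.

```r
# Hypothetical yearly inputs: in-memory data frames stand in for the real SCB files.
# File names and values are invented for illustration.
yearly <- list(
  "deso_2012.csv" = data.frame(lopnr = c(101L, 102L),
                               DeSO = c("1280C2650", "1280C2660"),
                               stringsAsFactors = FALSE),
  "deso_2013.csv" = data.frame(lopnr = c(101L, 102L),
                               DeSO = c("1280C2650", "1280C2670"),
                               stringsAsFactors = FALSE)
)

# Hypothetical helper: take the first run of four digits in a file name
# and return it as an integer year.
extract_year <- function(filename) {
  as.integer(regmatches(filename, regexpr("[0-9]{4}", filename)))
}

# Add a year column to each yearly table, then stack all tables.
with_year <- Map(function(df, nm) {
  df$year <- extract_year(nm)
  df
}, yearly, names(yearly))

combined <- do.call(rbind, unname(with_year))
rownames(combined) <- NULL
```

Taking the year from the file name rather than from a column mirrors the description of the year variable above. With real files, each data frame would instead be read from disk before the year is attached.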
The patient identifiers in the SCB files (lopnr) are mapped to the standard patient_id used in the SEM cohort study using a key file (SCB_Ekelund_LEV_Nyckel.csv), ensuring consistent patient identification across all datasets.
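The identifier mapping described above might look something like the sketch below. It assumes the key file has one row per lopnr with the study identifier in a column named Alias, as the variable description suggests. The toy values are invented and the real key file is not read here.

```r
# Toy key table. The column names lopnr and Alias follow the description above;
# the values are invented.
key <- data.frame(lopnr = c(101L, 102L, 103L), Alias = c(1L, 2L, 3L))

# Toy SCB records identified by lopnr.
scb <- data.frame(
  lopnr = c(101L, 103L, 101L),
  DeSO  = c("1280C2650", "1280C2660", "1280C2650"),
  year  = c(2012L, 2012L, 2013L),
  stringsAsFactors = FALSE
)

# Look up each lopnr in the key; values missing from the key would become NA.
scb$patient_id <- key$Alias[match(scb$lopnr, key$lopnr)]

# Stop if any record could not be mapped to a patient_id.
stopifnot(!anyNA(scb$patient_id))

# Keep the documented column names and drop the original identifier.
mapped <- data.frame(
  patient_id = scb$patient_id,
  deso_code  = scb$DeSO,
  year       = scb$year,
  stringsAsFactors = FALSE
)
```

Using match() keeps the row order and row count of the SCB records unchanged, which makes unmapped identifiers easy to spot as NA values before they are dropped or investigated.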
Note
DeSO codes are typically 9 characters long: a 4-digit municipality code followed by a letter and four digits (e.g., "1280C2650").
Almost all records have latest_personal_id = TRUE, indicating that most IDs are current.
A small percentage (<0.1%) have reused_personal_id = TRUE, indicating that the personal ID number has been reused.
coordination_number and incorrect_personal_id are FALSE for all records in the dataset but are retained for completeness.
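The points in this note can be checked with a few lines of base R. The sketch below runs on invented toy data in the documented layout, not on the real dataset. The object name scb_deso_toy is hypothetical, and the regular expression is an assumption inferred from the example code "1280C2650" (four digits, one uppercase letter, four digits).

```r
# Toy data in the documented column layout; all values are invented.
scb_deso_toy <- data.frame(
  patient_id            = c(1L, 2L, 3L),
  latest_personal_id    = c(TRUE, TRUE, TRUE),
  reused_personal_id    = c(FALSE, FALSE, TRUE),
  coordination_number   = c(FALSE, FALSE, FALSE),
  incorrect_personal_id = c(FALSE, FALSE, FALSE),
  deso_code             = c("1280C2650", "1280C2660", "bad"),
  year                  = c(2012L, 2012L, 2013L),
  stringsAsFactors = FALSE
)

# Assumed pattern: 4-digit municipality code, one uppercase letter, four digits.
deso_pattern <- "^[0-9]{4}[A-Z][0-9]{4}$"
valid_code <- grepl(deso_pattern, scb_deso_toy$deso_code)

# Municipality code = first four characters, NA where the code does not match.
municipality <- ifelse(valid_code,
                       substr(scb_deso_toy$deso_code, 1, 4),
                       NA_character_)

# Share of TRUE values in each flag column.
flag_cols <- c("latest_personal_id", "reused_personal_id",
               "coordination_number", "incorrect_personal_id")
flag_share <- vapply(scb_deso_toy[flag_cols], mean, numeric(1))
```

On the real data, flag_share would be expected to be close to 1 for latest_personal_id, below 0.001 for reused_personal_id, and 0 for the remaining two flags, in line with the statements above.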