SCB LISA Socioeconomic Dataset (2012-2018)
scb_lisa.Rd
This dataset contains socioeconomic and demographic information for patients from Statistics Sweden's Longitudinal Integration Database for Health Insurance and Labour Market Studies (LISA). The data spans from 2012 to 2018 and includes information about education, employment, income, benefits, and family structure.
Format
A data frame with the following variables:
- patient_id
Integer. Patient identifier; serves as a foreign key for linking to other patient-level data. Original field name: Alias in the key file, mapped from lopnr
- year
Integer. The year that the data pertains to (2012-2018).
- latest_personal_id
Logical. Flag indicating if this is the latest personal ID number for the individual. Original field name: SenPNr
- reused_personal_id
Logical. Flag indicating if this personal ID number has been reused. Original field name: AterPNr
- coordination_number
Logical. Flag indicating if this is a coordination number rather than a standard personal ID number. Original field name: SamOrdnNr
- incorrect_personal_id
Logical. Flag indicating if there are known errors with the ID. Original field name: FelPersonnr
- birth_year
Integer. Year of birth. Original field name: FodelseAr
- gender
Integer. Gender code: 1=male, 2=female. Original field name: Kon
- county
Character. County code. Original field name: Lan
- municipality
Character. Municipality code. Original field name: Kommun
- parish
Character. Parish code. Original field name: Forsamling
- civil_status
Character. Civil status code (G=married, OG=unmarried, S=divorced, etc.). Original field name: Civil
- citizenship_group
Character. Citizenship group classification. Original field name: MedbGrEg
- citizenship_group_detailed
Character. Detailed citizenship group classification. Original field name: MedbGrEg4
- children_0_3
Integer. Number of children aged 0-3 years in the household. Original field name: Barn0_3
- children_4_6
Integer. Number of children aged 4-6 years in the household. Original field name: Barn4_6
- children_7_10
Integer. Number of children aged 7-10 years in the household. Original field name: Barn7_10
- children_11_15
Integer. Number of children aged 11-15 years in the household. Original field name: Barn11_15
- children_16_17
Integer. Number of children aged 16-17 years in the household. Original field name: Barn16_17
- education_level_old
Character. Education level classification (1995-2018 definition). Original field name: Sun2000niva_old
- education_level
Numeric. Education level code (3 positions, 1991-2019 definition). Original field name: Sun2000niva
- education_field
Character. Field of education code (Sun2000 nomenclature). Original field name: Sun2000Inr
- graduation_year
Character. Year of highest completed education. Original field name: ExamAr
- employment_status
Numeric. Employment status code. Original field name: SyssStat11
- occupational_position
Numeric. Occupational position code. Original field name: YrkStalln
- occupation_code
Character. Detailed occupation code (SSYK). Original field name: Ssyk4_J16
- occupation_year
Character. Year when occupation data was recorded. Original field name: SsykAr_J16
- occupation_status
Numeric. Code indicating correspondence between the occupation code and the organization number. Original field name: SsykStatus_J16
- socioeconomic_group
Character. Occupation-based socioeconomic grouping. Original field name: YSEG
- parental_leave
Numeric. Parental leave benefits (amount in SEK). Original field name: ForLed
- sickness_benefits
Numeric. Sickness benefits (amount in SEK). Original field name: SjukPP
- sickness_type
Logical. Presence of sickness/occupational injury compensation. Original field name: SjukTyp
- rehabilitation_benefits
Numeric. Rehabilitation benefits (amount in SEK). Original field name: SjukRe
- sickness_days
Numeric. Number of days with sickness benefits (from MiDAS database). Original field name: SjukP_Ndag_MiDAS
- unemployment_benefits
Numeric. Unemployment benefits (amount in SEK). Original field name: ArbLos
- unemployment_days
Numeric. Number of days with unemployment benefits. Original field name: ALosDag
- labor_market_support
Numeric. Total income resulting from labor market policy measures (amount in SEK). Original field name: AmPol
- disability_pension
Numeric. Disability pension/early retirement (amount in SEK). Original field name: ForTid
- disability_pension_type
Logical. Presence of early retirement/sickness benefits/sickness compensation/activity support. Original field name: ForTidTyp
- sickness_compensation_days
Numeric. Number of days with sickness compensation. Original field name: SjukErs_Ndag_MiDAS
- capital_income
Numeric. Income from capital (amount in SEK). Original field name: KapInk
- pension_income
Numeric. Pension income (amount in SEK). Original field name: SumAldP03
- social_assistance
Numeric. Social assistance for family (amount in SEK). Original field name: SocBidrFam
- housing_allowance
Numeric. Housing allowance for family (amount in SEK). Original field name: BostBidrFam
- disposable_income
Numeric. Disposable income per consumption unit, family (amount in SEK). Original field name: DispInkKE
- disposable_income_ke04
Numeric. Disposable income per consumption unit, family, 2004 definition (amount in SEK). Original field name: DispInkKE04
- disposable_income_04
Numeric. Disposable income, 2004 definition (amount in SEK). Original field name: DispInk04
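Several of the variables above are stored as codes, so a labelling step is often useful before analysis. The following is a minimal sketch in base R, not part of the dataset itself. The toy rows are invented for illustration, the derived column names (gender_label, civil_label, children_under_7) are hypothetical, and only the codes documented above are labelled.

```r
# Toy rows standing in for scb_lisa; all values are made up for illustration.
lisa <- data.frame(
  patient_id   = c(1L, 1L, 2L),
  year         = c(2012L, 2013L, 2012L),
  gender       = c(1L, 1L, 2L),
  civil_status = c("G", "S", "OG"),
  children_0_3 = c(1L, 0L, 0L),
  children_4_6 = c(0L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Label the documented gender codes (1 = male, 2 = female).
lisa$gender_label <- factor(lisa$gender, levels = c(1L, 2L),
                            labels = c("male", "female"))

# Label only the civil status codes listed in this documentation.
# Any other code is left as NA here and would need the SCB code list.
civil_labels <- c(G = "married", OG = "unmarried", S = "divorced")
lisa$civil_label <- unname(civil_labels[lisa$civil_status])

# Hypothetical derived variable: children under 7 in the household.
child_cols <- c("children_0_3", "children_4_6")
lisa$children_under_7 <- rowSums(lisa[, child_cols])
```

Codes not covered by the lookup vector come out as NA rather than being guessed, which makes incomplete label tables easy to spot.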
Details
This dataset combines yearly files from Statistics Sweden's LISA database for the years 2012-2018. LISA (Longitudinal Integration Database for Health Insurance and Labour Market Studies) is a comprehensive database that integrates information from various registers to provide detailed socioeconomic data.
The data includes demographic information, education, employment, income sources, and various social benefits. This provides a comprehensive picture of patients' socioeconomic circumstances over time.
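Because the data hold one row per patient and year, per-patient summaries over time can be built with a simple aggregation. The sketch below uses base R on invented toy rows and assumes that patient_id and year together identify a row. It is an illustration only, not a prescribed analysis.

```r
# Toy long-format rows (one row per patient and year); values are made up.
lisa <- data.frame(
  patient_id    = c(1L, 1L, 1L, 2L, 2L),
  year          = c(2012L, 2013L, 2014L, 2012L, 2013L),
  sickness_days = c(0, 14, 30, 5, 0)
)

# Total sickness benefit days per patient across all observed years.
per_patient <- aggregate(sickness_days ~ patient_id, data = lisa, FUN = sum)

# Number of yearly records available for each patient.
years_observed <- aggregate(year ~ patient_id, data = lisa, FUN = length)
```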
The patient identifiers in the SCB files (lopnr) are mapped to the standard patient_id used in the SEM cohort study using a key file (SCB_Ekelund_LEV_Nyckel.csv), ensuring consistent patient identification across all datasets.
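The mapping described above amounts to a join on lopnr. The following is a hedged sketch in base R, not the actual import code. It assumes the key file has one row per lopnr with the columns lopnr and Alias, and it uses small in-memory data frames (key, raw_2012) as hypothetical stand-ins for the key file and one yearly extract.

```r
# Hypothetical stand-ins for the key file and one raw yearly LISA extract.
key <- data.frame(lopnr = c(101L, 102L, 103L),
                  Alias = c(1L, 2L, 3L))
raw_2012 <- data.frame(lopnr     = c(103L, 101L, 999L),
                       FodelseAr = c(1950L, 1962L, 1975L))

# Left join keeps every raw row; lopnr values missing from the key get Alias = NA.
mapped <- merge(raw_2012, key, by = "lopnr", all.x = TRUE)

# Rename to the documented column names, add the year, drop the SCB serial number.
names(mapped)[names(mapped) == "Alias"] <- "patient_id"
names(mapped)[names(mapped) == "FodelseAr"] <- "birth_year"
mapped$year <- 2012L
mapped$lopnr <- NULL

# Rows that could not be linked to a patient_id deserve a manual check.
unmatched <- sum(is.na(mapped$patient_id))
```

Using a left join rather than an inner join keeps unlinked rows visible, so linkage problems can be counted instead of silently dropped.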
Note
Several variables use coding systems that may require SCB documentation for complete interpretation, including:
- education_level (Sun2000niva): Swedish educational nomenclature
- education_field (Sun2000Inr): Field of education codes
- occupation_code (Ssyk4_J16): Swedish Standard Classification of Occupations
- socioeconomic_group (YSEG): Socioeconomic classification
All monetary values (benefits, income) are in Swedish kronor (SEK).
The disposable income measures represent different calculation methods:
- disposable_income (DispInkKE): Disposable income per consumption unit, family
- disposable_income_ke04 (DispInkKE04): Disposable income per consumption unit, family, using the 2004 definition
- disposable_income_04 (DispInk04): Disposable income, 2004 definition
Missing values in occupation-related variables may indicate unemployment or being outside the labor force.
An empty string ("") or "***" in a character field often indicates missing or not-applicable data.
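If these placeholder values should be treated as missing, they can be recoded to NA before analysis. The sketch below is one possible approach in base R, using invented toy rows. It assumes that every "" or "***" in a character column really is a placeholder, which the note above states only as the usual case, so the result should be checked against SCB documentation.

```r
# Toy rows; values are made up for illustration.
lisa <- data.frame(
  patient_id          = c(1L, 2L, 3L),
  occupation_code     = c("2221", "", "***"),
  socioeconomic_group = c("56", "***", ""),
  sickness_benefits   = c(0, 1200, NA),
  stringsAsFactors = FALSE
)

# Recode "" and "***" to NA in character columns only.
# Numeric columns are left untouched: a 0 in a benefit column is a real zero.
is_chr <- vapply(lisa, is.character, logical(1))
lisa[is_chr] <- lapply(lisa[is_chr], function(x) {
  x[x %in% c("", "***")] <- NA_character_
  x
})
```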