Structural Equation Modeling | Exercises 4
1 What we are going to cover
- Ex.1 – Non-normal continuous data
- Ex.2 – Categorical data multi-group
2 Data
The data set used throughout is the European Social Survey ESS4-2008, Edition 4.5, released on 1 December 2018. We restrict the analysis to the Belgian case: each row in the data set represents one Belgian respondent. The full data set and documentation can be found on the ESS website.
Codebook:
gvslvol Standard of living for the old, governments’ responsibility (0 Not governments’ responsibility at all - 10 Entirely governments’ responsibility)
gvslvue Standard of living for the unemployed, governments’ responsibility (0 Not governments’ responsibility at all - 10 Entirely governments’ responsibility)
gvhlthc Health care for the sick, governments’ responsibility (0 Not governments’ responsibility at all - 10 Entirely governments’ responsibility)
gvcldcr Child care services for working parents, governments’ responsibility (0 Not governments’ responsibility at all - 10 Entirely governments’ responsibility)
gvjbevn Job for everyone, governments’ responsibility (0 Not governments’ responsibility at all - 10 Entirely governments’ responsibility)
gvpdlwk Paid leave from work to care for sick family, governments’ responsibility (0 Not governments’ responsibility at all - 10 Entirely governments’ responsibility)
sbstrec Social benefits/services place too great strain on economy (1 Agree strongly - 5 Disagree strongly)
sbbsntx Social benefits/services cost businesses too much in taxes/charges (1 Agree strongly - 5 Disagree strongly)
sbprvpv Social benefits/services prevent widespread poverty (1 Agree strongly - 5 Disagree strongly)
sbeqsoc Social benefits/services lead to a more equal society (1 Agree strongly - 5 Disagree strongly)
sbcwkfm Social benefits/services make it easier to combine work and family (1 Agree strongly - 5 Disagree strongly)
sblazy Social benefits/services make people lazy (1 Agree strongly - 5 Disagree strongly)
sblwcoa Social benefits/services make people less willing care for one another (1 Agree strongly - 5 Disagree strongly)
sblwlka Social benefits/services make people less willing look after themselves/family (1 Agree strongly - 5 Disagree strongly)
agea Respondent’s age
eduyrs Years of full-time education completed
gndr Gender (1 Male, 2 Female)
hinctnta Household’s total net income, all sources (Deciles of the actual household income range in the given country.)
gincdif Government should reduce differences in income levels (1 Agree strongly - 5 Disagree strongly)
dfincac Large differences in income acceptable to reward talents and efforts (1 Agree strongly - 5 Disagree strongly)
smdfslv For fair society, differences in standard of living should be small (1 Agree strongly - 5 Disagree strongly)
3 Environment preparation
First, let’s load the packages needed to import, manipulate, visualize and analyse the data.
# Uncomment this once if you need to install the packages on your system
### DATA MANIPULATION ###
# install.packages("haven") # data import from spss
# install.packages("dplyr") # data manipulation
# install.packages("psych") # descriptives
# install.packages("stringr") # string manipulation
# install.packages("purrr") # table manipulation
### MODELING ###
#### MVN requires the gsl package: install.packages("gsl").
#### On a Mac with gsl installed via Homebrew:
#### CFLAGS="-I/opt/homebrew/Cellar/gsl/2.7.1/include" LDFLAGS="-L/opt/homebrew/Cellar/gsl/2.7.1/lib -lgsl -lgslcblas" R
#### install.packages("gsl")
# install.packages("lavaan") # SEM modelling
# install.packages("MVN") # tests for multivariate normality
# install.packages("Amelia") # performing multiple imputation
### VISUALIZATION ###
# install.packages("tidySEM") # plotting SEM models
# install.packages("ggplot2") # plotting
# install.packages("patchwork") # organization plots
# Load the packages
### DATA MANIPULATION ###
library("haven")
library("dplyr")
library("psych")
library("stringr")
library("purrr")
### MODELING ###
library("lavaan")
library("MVN")
library("Amelia")
### VISUALIZATION ###
library("tidySEM")
library("ggplot2")
library("patchwork")
4 Ex.1 – Non-normal continuous data
- Check multivariate normality for the items gincdif, dfincac, smdfslv (egalitarianism) and gvslvol, gvhlthc, gvcldcr, gvpdlwk (welfare support).
- Estimate a simple CFA (2 latent variables), without any error covariances, for egalitarianism (gincdif, dfincac, smdfslv) and welfare support (gvslvol, gvhlthc, gvcldcr, gvpdlwk).
- Estimate the model using ML estimation, then re-estimate it using the robust MLR estimator.
- Compare fit statistics, loadings, and standard errors.
- Remove the covariance between egalitarianism and welfare support and re-estimate the model. Compare the fit (OPTIONAL).
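The first steps above can be sketched as follows. This is only one possible approach, not the reference solution. It uses Mardia's test from psych (`psych::mardia`) in place of `MVN::mvn`, because the argument names of `mvn` differ across MVN versions. It takes the robust estimator to be MLR, matching the object names in the code skeleton. The names `items`, `ess_items` and `comparison` are hypothetical.

```r
library(haven)
library(psych)
library(lavaan)

ess_df <- haven::read_sav("https://github.com/albertostefanelli/SEM_labs/raw/master/data/ESS4_belgium.sav")

items <- c("gincdif", "dfincac", "smdfslv", "gvslvol", "gvhlthc", "gvcldcr", "gvpdlwk")
# drop the SPSS value labels so the items are plain numeric columns,
# then keep complete cases only
ess_items <- na.omit(as.data.frame(haven::zap_labels(ess_df[, items])))

# Mardia's multivariate skewness and kurtosis tests
# (small p-values indicate departures from multivariate normality)
mardia_res <- psych::mardia(ess_items, plot = FALSE)
c(p_skew = mardia_res$p.skew, p_kurtosis = mardia_res$p.kurt)

model_eg_ws <- '
  egual =~ gincdif + dfincac + smdfslv
  ws    =~ gvslvol + gvhlthc + gvcldcr + gvpdlwk
'
fit_ml  <- cfa(model_eg_ws, data = ess_items, estimator = "ML")
fit_mlr <- cfa(model_eg_ws, data = ess_items, estimator = "MLR")

# put estimates and standard errors of both fits side by side
cols <- c("lhs", "op", "rhs", "est", "se")
comparison <- merge(
  as.data.frame(parameterEstimates(fit_ml))[, cols],
  as.data.frame(parameterEstimates(fit_mlr))[, cols],
  by = c("lhs", "op", "rhs"),
  suffixes = c("_ml", "_mlr")
)
comparison

# naive and robust versions of the main fit statistics
fitMeasures(fit_mlr, c("chisq", "chisq.scaled", "cfi", "cfi.robust", "rmsea", "rmsea.robust"))
```

The point estimates should coincide across the two fits; what changes is the standard errors and the test statistic.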
ess_df <- haven::read_sav("https://github.com/albertostefanelli/SEM_labs/raw/master/data/ESS4_belgium.sav")
# select the egalitarianism and welfare support items
ess_df_eg <- ess_df[,c("gincdif", "dfincac", "smdfslv", "gvslvol", "gvhlthc", "gvcldcr","gvpdlwk")]
# remove NAs
ess_df_eg_na <- na.omit(ess_df_eg)
mvn_test <- mvn(
)
mvn_test
model_eg_ws <- '
## Egalitarianism ##
egual =~ gincdif + dfincac + smdfslv
## Welfare support ##
ws =~ gvslvol + gvhlthc + gvcldcr + gvpdlwk
'
fit_eg_ws_ml <- cfa(model_eg_ws, # model formula
  data = ess_df, # data frame
  estimator = "ML" # select the estimator
)
summary(
)
fit_eg_ws_mlr <- cfa(
)
summary()
tidy_results_ml <- table_results(
)
tidy_results_mlr <- table_results(
)
data.frame(Parameters = ,
  "ML Model" = ,
  "MLR Model" =
)
model_eg_ws_cov <- '
'
fit_eg_ws_mlr_cov <- cfa(
)
summary(fit_eg_ws_mlr_cov)
# let's compare the fit of the different models
model_fit <- function(lavobject) {
  vars <- c("chisq", "df", "cfi", "tli", "rmsea", "rmsea.ci.lower", "rmsea.ci.upper", "rmsea.pvalue", "srmr")
  return(fitmeasures(lavobject)[vars] %>% data.frame() %>% round(2) %>% t())
}
table_fit <-
rownames(table_fit) <-
table_fit
anova(, )
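The optional last step of Ex.1 (removing the factor covariance and comparing the fit) could look like the sketch below. It assumes the covariance is fixed with `egual ~~ 0*ws` and that both models are estimated with MLR. The object names `fit_mlr`, `fit_mlr_cov` and `fit_vars` are hypothetical, and the fit table is built with `sapply` instead of the `model_fit` helper above.

```r
library(haven)
library(lavaan)

ess_df <- haven::read_sav("https://github.com/albertostefanelli/SEM_labs/raw/master/data/ESS4_belgium.sav")
items <- c("gincdif", "dfincac", "smdfslv", "gvslvol", "gvhlthc", "gvcldcr", "gvpdlwk")
ess_items <- as.data.frame(haven::zap_labels(ess_df[, items]))

model_eg_ws <- '
  egual =~ gincdif + dfincac + smdfslv
  ws    =~ gvslvol + gvhlthc + gvcldcr + gvpdlwk
'
# same measurement model, but the covariance between the factors is fixed to 0
model_eg_ws_cov <- '
  egual =~ gincdif + dfincac + smdfslv
  ws    =~ gvslvol + gvhlthc + gvcldcr + gvpdlwk
  egual ~~ 0*ws
'
fit_mlr     <- cfa(model_eg_ws,     data = ess_items, estimator = "MLR")
fit_mlr_cov <- cfa(model_eg_ws_cov, data = ess_items, estimator = "MLR")

# one row per model, one column per fit measure
fit_vars <- c("chisq.scaled", "df", "cfi.robust", "tli.robust", "rmsea.robust", "srmr")
table_fit <- t(sapply(
  list(correlated = fit_mlr, uncorrelated = fit_mlr_cov),
  function(f) round(unclass(fitMeasures(f, fit_vars)), 3)
))
table_fit

# scaled chi-square difference test between the two nested models
anova(fit_mlr, fit_mlr_cov)
```

The constrained model has one more degree of freedom, so a significant difference test indicates that fixing the covariance to zero worsens the fit.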
5 Ex.2 – Measurement invariance with categorical data
- Re-estimate a CFA (2 latent variables) for egalitarianism (gincdif, dfincac, smdfslv) and welfare support (gvslvol, gvhlthc, gvcldcr, gvpdlwk), treating the items as ordinal.
- Test whether the two latent constructs reach scalar invariance between male and female respondents (free/fixed factor loadings and thresholds).
- Compare and interpret the fit.
- Even if it is not necessary, relax some equality constraints, re-fit the model, and compare the fit.
- Compare the metric model fitted with continuous data with the one fitted with ordinal data (OPTIONAL)
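The invariance sequence described above can be sketched as follows. This is one possible approach, not the reference solution. It uses the item set of the code skeleton below (egalitarianism and the three items sbprvpv, sbeqsoc, sbcwkfm), lavaan's default estimator for ordered items, and a hypothetical data frame `ess_ord`.

```r
library(haven)
library(lavaan)

ess_df <- haven::read_sav("https://github.com/albertostefanelli/SEM_labs/raw/master/data/ESS4_belgium.sav")

items <- c("gincdif", "dfincac", "smdfslv", "sbprvpv", "sbeqsoc", "sbcwkfm")
ess_ord <- as.data.frame(haven::zap_labels(ess_df[, c(items, "gndr")]))
# gender as a labelled factor (1 Male, 2 Female in the codebook)
ess_ord$gndr <- factor(ess_ord$gndr, levels = c(1, 2), labels = c("Male", "Female"))
ess_ord <- ess_ord[!is.na(ess_ord$gndr), ]

model_eg_sc <- '
  egual    =~ gincdif + dfincac + smdfslv
  wc_socia =~ sbprvpv + sbeqsoc + sbcwkfm
'

# configural: same structure, all parameters free across groups
fit_conf <- cfa(model_eg_sc, data = ess_ord, ordered = items, group = "gndr")
# metric: loadings constrained to be equal across groups
fit_metric <- cfa(model_eg_sc, data = ess_ord, ordered = items, group = "gndr",
                  group.equal = "loadings")
# scalar: loadings and thresholds constrained to be equal across groups
fit_scalar <- cfa(model_eg_sc, data = ess_ord, ordered = items, group = "gndr",
                  group.equal = c("loadings", "thresholds"))

# nested model comparison
anova(fit_conf, fit_metric, fit_scalar)
```

Each step adds equality constraints, so the degrees of freedom increase from the configural to the scalar model.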
Lavaan syntax for relaxing equality constraints:
y | c(label1,label1)*t1 # set threshold equal across 2 groups
y | c(label1,label2)*t1 # set threshold free across 2 groups
u3 ~*~ c(1,1)*u3 # fix scale of the endogenous variable "u3" to 1 in both groups
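As an alternative to writing the labels by hand, a single constraint can be released with the `group.partial` argument of `cfa`. The sketch below assumes that the first threshold of dfincac is the one to free, as the comment in the skeleton further down suggests. The names `ess_ord`, `fit_scalar` and `fit_partial` are hypothetical.

```r
library(haven)
library(lavaan)

ess_df <- haven::read_sav("https://github.com/albertostefanelli/SEM_labs/raw/master/data/ESS4_belgium.sav")

items <- c("gincdif", "dfincac", "smdfslv", "sbprvpv", "sbeqsoc", "sbcwkfm")
ess_ord <- as.data.frame(haven::zap_labels(ess_df[, c(items, "gndr")]))
ess_ord$gndr <- factor(ess_ord$gndr, levels = c(1, 2), labels = c("Male", "Female"))
ess_ord <- ess_ord[!is.na(ess_ord$gndr), ]

model_eg_sc <- '
  egual    =~ gincdif + dfincac + smdfslv
  wc_socia =~ sbprvpv + sbeqsoc + sbcwkfm
'

# full scalar model: equal loadings and thresholds
fit_scalar <- cfa(model_eg_sc, data = ess_ord, ordered = items, group = "gndr",
                  group.equal = c("loadings", "thresholds"))

# partial model: same constraints, except that the first threshold
# of dfincac is estimated freely in each group
fit_partial <- cfa(model_eg_sc, data = ess_ord, ordered = items, group = "gndr",
                   group.equal = c("loadings", "thresholds"),
                   group.partial = c("dfincac|t1"))

# does freeing the threshold improve the fit?
anova(fit_partial, fit_scalar)
```

Entries in `group.partial` use the parameter name in lavaan notation (left-hand side, operator, right-hand side), so a loading would be written as, for example, "egual=~dfincac".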
ess_df$gndr <- factor(
  # labels
)
model_eg_sc <- '
## Egalitarianism ##
egual =~ gincdif + dfincac + smdfslv
## Social Criticism ##
wc_socia =~ sbprvpv + sbeqsoc + sbcwkfm
'
fit_eg_sc_conf <- cfa(,
  ordered = c(
  )
)
fit_eg_sc_metric <- cfa(
)
fit_eg_sc_scalar <- cfa(
)
anova(fit_eg_sc_conf,fit_eg_sc_metric,fit_eg_sc_scalar)
model_eg_sc_p <- '
'
fit_eg_sc_metric_p <- cfa(
  group.equal=c(),
  # relax the threshold t1 of the variable dfincac
  group.partial=c())
# we do not reach partial invariance
anova( , , )
fit_eg_sc_metric_c <- cfa(
)
table_fit <-
rownames(table_fit) <-
table_fit
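For the optional comparison between the continuous and the ordinal metric model, a minimal sketch is given below. It assumes that the continuous version is estimated with MLR and that the scaled fit measures are the ones of interest. The names `fit_metric_ord`, `fit_metric_con` and `fit_vars` are hypothetical.

```r
library(haven)
library(lavaan)

ess_df <- haven::read_sav("https://github.com/albertostefanelli/SEM_labs/raw/master/data/ESS4_belgium.sav")

items <- c("gincdif", "dfincac", "smdfslv", "sbprvpv", "sbeqsoc", "sbcwkfm")
ess_ord <- as.data.frame(haven::zap_labels(ess_df[, c(items, "gndr")]))
ess_ord$gndr <- factor(ess_ord$gndr, levels = c(1, 2), labels = c("Male", "Female"))
ess_ord <- ess_ord[!is.na(ess_ord$gndr), ]

model_eg_sc <- '
  egual    =~ gincdif + dfincac + smdfslv
  wc_socia =~ sbprvpv + sbeqsoc + sbcwkfm
'

# metric model with the items treated as ordinal
fit_metric_ord <- cfa(model_eg_sc, data = ess_ord, ordered = items,
                      group = "gndr", group.equal = "loadings")
# metric model with the items treated as continuous
fit_metric_con <- cfa(model_eg_sc, data = ess_ord, estimator = "MLR",
                      group = "gndr", group.equal = "loadings")

# one row per model, one column per fit measure
fit_vars <- c("chisq.scaled", "df", "cfi.scaled", "rmsea.scaled", "srmr")
table_fit <- t(sapply(
  list(ordinal = fit_metric_ord, continuous = fit_metric_con),
  function(f) round(unclass(fitMeasures(f, fit_vars)), 3)
))
table_fit
```

The two models rely on different estimators and different numbers of parameters (thresholds versus intercepts and residual variances), so the table is descriptive and the fit values should not be tested against each other.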