Intro to Conjoint Experiments | Lab - 5

1 Outline

  1. Interactive models
    • Between Attributes
    • Subgroups Analysis (AMCE and MM)
  2. Constrained design
  3. Non-uniform marginal distribution
    • Design-based approaches
    • Model-based approaches
  4. Power Analysis using simulations
  5. Mixture models
    • Treatment heterogeneity

2 Dataset used

  1. 2019 CEU Experimental Political Science dataset
    • Q578 Q579 Q580: Choice CJ Task
    • F-*-*: Conjoint features
    • ResponseId: id
    • Q78: Employment status
    • Q77: Race
    • Q76: Education
    • Q75: Gender
    • Q74: Age
    • Q581: Religiosity
  2. Immigration experiment from Hainmueller, Hopkins, and Yamamoto (2014)
  3. Partisan labels experiment from Kirkland and Coppock (2018)

3 Environment preparation

# ### Data import ###
# install.packages("readr")      # read datasets
# install.packages("qualtRics")  # read qualtrics datasets
# ### Data manipulation ###
# install.packages("dplyr")      # pipes and data manipulation
# install.packages("stringr")     # text manipulation
# ### Visualization ###
# install.packages("ggplot2")    # graphing capabilities
# install.packages("ggimage")    # integrating images in ggplot
# ### Estimation ###
# install.packages("cjoint")     # base amce package
# install.packages("cregg")      # amce and mm 
# install.packages("factorEx")   # amce with non-uniform distribution
# install.packages("flexmix")    # mixture models 

## Custom build functions 
# library(devtools)
# devtools::install_github("albertostefanelli/cjoint") # fixes some problem with cjoint
# devtools::install_github("naoki-egami/factorEx", dependencies=TRUE) # amce with non-uniform distribution
# devtools::install_github("albertostefanelli/cjsimPWR") # power analysis using simulation


### Data import ###
library("readr")     
library("qualtRics") 
### Data manipulation ###
library("dplyr")     
library("stringr")     
### Visualization ###
library("ggplot2")    
library("ggimage")
### Estimation ###
library("cjoint")   
library("cregg")     
library("factorEx") 
library("flexmix")
library("cjsimPWR")

4 Load the data: Non-Qualtrics data

df_base <- readr::read_csv("https://github.com/albertostefanelli/conjoint_class/raw/master/data/experimental_political_science_2019_cleaned.csv")

head(df_base)
# A tibble: 6 × 48
   ...1 `Duration (in seconds)` Finished ResponseId      Q578  Q579  Q580  Q271_1 Q271_2 Q271_3 Q271_4 Q271_5 Q271_6 Q78   Q77   Q76   Q75     Q74 Q581  `F-1-1` `F-1-1-1` `F-1-2` `F-1-1-2` `F-1-3` `F-1-1-3` `F-1-2-1` `F-1-2-2` `F-1-2-3` `F-2-1` `F-2-1-1` `F-2-2` `F-2-1-2` `F-2-3` `F-2-1-3` `F-2-2-1`
  <dbl>                   <dbl> <lgl>    <chr>           <chr> <chr> <chr> <chr>  <chr>  <chr>  <chr>  <chr>  <chr>  <chr> <chr> <chr> <chr> <dbl> <chr> <chr>   <chr>     <chr>   <chr>     <chr>   <chr>     <chr>     <chr>     <chr>     <chr>   <chr>     <chr>   <chr>     <chr>   <chr>     <chr>    
1     1                     295 TRUE     R_3QMq6EzoaEhT… Cand… Cand… Cand… 6      6      6      6      3      5      Stud… Cauc… Some… Male     22 Never Past P… None      Policy… Welcome … Penal … The cand… Approxim… Acceptan… No proce… Past P… Approxim… Policy… Acceptan… Penal … The cand… Approxim…
2     2                     358 TRUE     R_2ZWYdunD4yrj… Cand… Cand… Cand… 6      3      3      6      1 - S… 7 - S… Unem… Cauc… Some… Male     32 Never Past P… Approxim… Penal … No proce… Policy… Welcome … Approxim… The cand… Acceptan… Past P… Approxim… Penal … No proce… Policy… Acceptan… Approxim…
3     3                     310 TRUE     R_3GEedrAF3Om8… Cand… Cand… Cand… 6      3      6      5      3      6      Stud… Asian Some… Male     21 Never Penal … The cand… Past P… None      Policy… Acceptan… No proce… None      Acceptan… Penal … The cand… Past P… Approxim… Policy… Collecti… The cand…
4     4                     365 TRUE     R_emv0H4GnLSxq… Cand… Cand… Cand… 6      5      5      4 - N… 5      5      Empl… Cauc… Some… Male     30 Regu… Past P… None      Policy… Acceptan… Penal … The cand… None      Collecti… No proce… Past P… Approxim… Policy… Acceptan… Penal … No proce… None     
5     5                     364 TRUE     R_1CIruyJc7nV1… Cand… Cand… Cand… 5      4 - N… 5      5      6      5      Empl… Cauc… Bach… Male     30 Regu… Policy… Welcome … Past P… Approxim… Penal … The cand… Welcome … Approxim… The cand… Policy… Welcome … Past P… None      Penal … The cand… Acceptan…
6     6                     439 TRUE     R_2TMFzzb17eua… Cand… Cand… Cand… 5      4 - N… 4 - N… 4 - N… 3      5      Empl… Hisp… Bach… Male     41 Some… Policy… Collecti… Penal … The cand… Past P… Approxim… Welcome … The cand… None      Policy… Acceptan… Penal … No proce… Past P… None      Welcome …
# ℹ 13 more variables: `F-2-2-2` <chr>, `F-2-2-3` <chr>, `F-3-1` <chr>, `F-3-1-1` <chr>, `F-3-2` <chr>, `F-3-1-2` <chr>, `F-3-3` <chr>, `F-3-1-3` <chr>, `F-3-2-1` <chr>, `F-3-2-2` <chr>, `F-3-2-3` <chr>, duration_mins <dbl>, scaled_duration_mins <dbl>
# transform to numeric
df_base$Q578 <- as.numeric(as.factor(df_base$Q578))
df_base$Q579 <- as.numeric(as.factor(df_base$Q579))
df_base$Q580 <- as.numeric(as.factor(df_base$Q580))

df_long_format <- read.df.qualtRics(df_base, 
  # the DV for our conjoint tasks (df_base 1 and Candidate 2)
  responses=c("Q578","Q579","Q580"), 
  #covariates=c("Q76"),
  respondentID = "ResponseId"
  )

5 Interaction between attributes (ACIE) and trade-offs

Our expectation is that the effect of being accused of corruption is conditional on the time a politician has spent in politics.

fit_1 <- cjoint::amce(selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings * Past.Political.Experience,
  data=df_long_format,
  cluster=TRUE, 
  respondent.id="ResponseId"
)

summary(fit_1)
------------------------------------------
Average Marginal Component Effects (AMCE):
------------------------------------------
                 Attribute                                                        Level   Estimate Std. Err   z value                                  Pr(>|z|)    
 Past.Political.Experience                                       Approximately 20 years  0.0096664 0.034141   0.28313 0.777075131376317118636620762117672711611    
 Past.Political.Experience                                                         None -0.1279208 0.030551  -4.18718 0.000028244150852242063960261669874896029 ***
         Penal.proceedings               The candidate has been convicted of corruption -0.4107275 0.032990 -12.44993 0.000000000000000000000000000000000013998 ***
         Penal.proceedings          The candidate is under investigation for corruption -0.2957963 0.034377  -8.60452 0.000000000000000007663632871421139651151 ***
           Policy.Proposal Collective expulsion of immigrants and closure of the border -0.2743198 0.038022  -7.21486 0.000000000000539910104817876547303644823 ***
           Policy.Proposal               Welcome immigrants and organise human corridor -0.0751002 0.038872  -1.93200 0.053359578223767793747622789624074357562    
---
Number of Obs. = 1110
---
Number of Respondents = 185
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05

--------------------
AMCE Baseline Levels:
--------------------
                 Attribute                                                      Level
 Past.Political.Experience                                     Approximately 10 years
         Penal.proceedings                                             No proceedings
           Policy.Proposal Acceptance of immigrants conditional on certain requisites


---------------------------------------------
Average Component Interaction Effects (ACIE):
---------------------------------------------
                                   Attribute                                                                      Level  Estimate Std. Err  z value Pr(>|z|)  
 Past.Political.Experience:Penal.proceedings      Approximately 20 years:The candidate has been convicted of corruption -0.161383 0.077751 -2.07564 0.037927 *
 Past.Political.Experience:Penal.proceedings                        None:The candidate has been convicted of corruption -0.066669 0.082882 -0.80438 0.421178  
 Past.Political.Experience:Penal.proceedings Approximately 20 years:The candidate is under investigation for corruption -0.089770 0.081424 -1.10251 0.270240  
 Past.Political.Experience:Penal.proceedings                   None:The candidate is under investigation for corruption -0.112070 0.083351 -1.34456 0.178768  
---
Number of Obs. = 1110
Number of Respondents = 185
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05

--------------------
ACIE Baseline Levels:
--------------------
                                   Attribute                                 Level
 Past.Political.Experience:Penal.proceedings Approximately 10 years:No proceedings

5.1 ACIE: Least liked profile

  • Sometimes it is interesting to assess the combination of attributes that are the least liked by the respondents
  • This is might be relevant in some context (e.g. new policy)
  • To achieve this, we are going to fit a full saturated model
    • Meaning, we fit a model with all the highest interactions possible
    • In this case, we fit a model with all the 3rd order interactions
fit_2 <- cjoint::amce(selected ~ 
  Past.Political.Experience*Penal.proceedings*Policy.Proposal,
  data=df_long_format,
  cluster=TRUE, 
  respondent.id="ResponseId"
)

summary(fit_2)
------------------------------------------
Average Marginal Component Effects (AMCE):
------------------------------------------
                 Attribute                                                        Level  Estimate Std. Err   z value                                 Pr(>|z|)    
 Past.Political.Experience                                       Approximately 20 years  0.013514 0.034073   0.39661 0.69165537662429610321623840718530118465    
 Past.Political.Experience                                                         None -0.125044 0.031668  -3.94863 0.00007859878292662227832268695415507409 ***
         Penal.proceedings               The candidate has been convicted of corruption -0.402841 0.032881 -12.25161 0.00000000000000000000000000000000016469 ***
         Penal.proceedings          The candidate is under investigation for corruption -0.288860 0.034657  -8.33494 0.00000000000000007753662744227048302961 ***
           Policy.Proposal Collective expulsion of immigrants and closure of the border -0.280505 0.037839  -7.41315 0.00000000000012333536218225874866920267 ***
           Policy.Proposal               Welcome immigrants and organise human corridor -0.077665 0.039118  -1.98539 0.04710063098324948627215746910223970190   *
---
Number of Obs. = 1110
---
Number of Respondents = 185
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05

--------------------
AMCE Baseline Levels:
--------------------
                 Attribute                                                      Level
 Past.Political.Experience                                     Approximately 10 years
         Penal.proceedings                                             No proceedings
           Policy.Proposal Acceptance of immigrants conditional on certain requisites


---------------------------------------------
Average Component Interaction Effects (ACIE):
---------------------------------------------
                                                   Attribute                                                                                                                                   Level   Estimate Std. Err  z value  Pr(>|z|)   
                 Past.Political.Experience:Penal.proceedings                                                                   Approximately 20 years:The candidate has been convicted of corruption -0.1689454 0.077037 -2.19303 0.0283051  *
                 Past.Political.Experience:Penal.proceedings                                                                                     None:The candidate has been convicted of corruption -0.0796609 0.084073 -0.94752 0.3433740   
                 Past.Political.Experience:Penal.proceedings                                                              Approximately 20 years:The candidate is under investigation for corruption -0.1025972 0.080967 -1.26714 0.2051040   
                 Past.Political.Experience:Penal.proceedings                                                                                None:The candidate is under investigation for corruption -0.1279843 0.083486 -1.53300 0.1252759   
                   Past.Political.Experience:Policy.Proposal                                                     Approximately 20 years:Collective expulsion of immigrants and closure of the border -0.0655741 0.088543 -0.74059 0.4589435   
                   Past.Political.Experience:Policy.Proposal                                                                       None:Collective expulsion of immigrants and closure of the border  0.0088892 0.081457  0.10913 0.9131019   
                   Past.Political.Experience:Policy.Proposal                                                                   Approximately 20 years:Welcome immigrants and organise human corridor -0.0644945 0.087605 -0.73620 0.4616113   
                   Past.Political.Experience:Policy.Proposal                                                                                     None:Welcome immigrants and organise human corridor  0.0930505 0.088700  1.04904 0.2941581   
                           Penal.proceedings:Policy.Proposal                             The candidate has been convicted of corruption:Collective expulsion of immigrants and closure of the border  0.0734913 0.071416  1.02905 0.3034552   
                           Penal.proceedings:Policy.Proposal                        The candidate is under investigation for corruption:Collective expulsion of immigrants and closure of the border  0.0433472 0.077717  0.55775 0.5770117   
                           Penal.proceedings:Policy.Proposal                                           The candidate has been convicted of corruption:Welcome immigrants and organise human corridor -0.0856311 0.081183 -1.05479 0.2915212   
                           Penal.proceedings:Policy.Proposal                                      The candidate is under investigation for corruption:Welcome immigrants and organise human corridor -0.0452215 0.075772 -0.59681 0.5506367   
 Past.Political.Experience:Penal.proceedings:Policy.Proposal      Approximately 20 years:The candidate has been convicted of corruption:Collective expulsion of immigrants and closure of the border -0.5391220 0.202312 -2.66481 0.0077032 **
 Past.Political.Experience:Penal.proceedings:Policy.Proposal                        None:The candidate has been convicted of corruption:Collective expulsion of immigrants and closure of the border -0.0840575 0.217443 -0.38657 0.6990721   
 Past.Political.Experience:Penal.proceedings:Policy.Proposal Approximately 20 years:The candidate is under investigation for corruption:Collective expulsion of immigrants and closure of the border -0.3888759 0.206953 -1.87906 0.0602370   
 Past.Political.Experience:Penal.proceedings:Policy.Proposal                   None:The candidate is under investigation for corruption:Collective expulsion of immigrants and closure of the border -0.0770578 0.230378 -0.33448 0.7380138   
 Past.Political.Experience:Penal.proceedings:Policy.Proposal                    Approximately 20 years:The candidate has been convicted of corruption:Welcome immigrants and organise human corridor  0.0242687 0.204120  0.11889 0.9053590   
 Past.Political.Experience:Penal.proceedings:Policy.Proposal                                      None:The candidate has been convicted of corruption:Welcome immigrants and organise human corridor  0.2053987 0.206518  0.99458 0.3199412   
 Past.Political.Experience:Penal.proceedings:Policy.Proposal               Approximately 20 years:The candidate is under investigation for corruption:Welcome immigrants and organise human corridor  0.2400620 0.197758  1.21392 0.2247785   
 Past.Political.Experience:Penal.proceedings:Policy.Proposal                                 None:The candidate is under investigation for corruption:Welcome immigrants and organise human corridor  0.1870717 0.202510  0.92377 0.3556086   
---
Number of Obs. = 1110
Number of Respondents = 185
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05

--------------------
ACIE Baseline Levels:
--------------------
                                                   Attribute                                                                                            Level
                 Past.Political.Experience:Penal.proceedings                                                            Approximately 10 years:No proceedings
                   Past.Political.Experience:Policy.Proposal                Approximately 10 years:Acceptance of immigrants conditional on certain requisites
                           Penal.proceedings:Policy.Proposal                        No proceedings:Acceptance of immigrants conditional on certain requisites
 Past.Political.Experience:Penal.proceedings:Policy.Proposal Approximately 10 years:No proceedings:Acceptance of immigrants conditional on certain requisites
  • In this case, the last liked profile is the a high experience candidate that has been convicted and want to expel all the migrants and shut down the border
  • The interpretation needs to be causal so we need to flip the previous statement and focus on the attributes
  • Q: How can the “least liked” profile be casually interpreted?

6 Subgroup analysis (conditional AMCEs)

  • Expectation: minority respondents display group interest such as they will be more in favour of accepting migrants
  • Prepare the data
    1. Transform race into factor
    2. Use a reference category Caucasian
    3. Wide to Long
    4. Omit NAs
df_base$Q77 <- factor(df_base$Q77)
df_base$Q77 <- relevel(df_base$Q77, "Caucasian")

df_long_format <- read.df.qualtRics(df_base, 
  responses=c("Q578","Q579","Q580"), 
  covariates=c("Q77"),
  respondentID = "ResponseId"
  )

# interactions work ONLY with compete observations
df_long_format <- na.omit(df_long_format)

6.1 Conditional AMCE

  • Run the AMCE with the interaction between each Attribute and the race of the respondent
fit_3 <-cregg::cj(df_long_format, selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings, 
  by = ~Q77,
  id = ~ResponseId,
  estimate = "amce")
fit_3_diff <-cregg::cj(df_long_format, selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings, 
  by = ~Q77,
  id = ~ResponseId,
  estimate = "amce_diff")
plot(rbind(fit_3, fit_3_diff)) + ggplot2::facet_wrap(~BY)

  • Some minority groups seem to display preference that support my hypothesis
  • There is reference heterogeneity between respondents

6.2 Marginal Means

But again, this plot showcases differences in conjoint effect sizes (AMCEs) not descriptive differences in underlying preferences. A plot of the differences in MMs might be better

fit_4 <-cregg::cj(df_long_format, selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings, 
  by = ~Q77,
  id = ~ResponseId,
  estimate = "mm")

fit_4_diff <-cregg::cj(df_long_format, selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings, 
  by = ~Q77,
  id = ~ResponseId,
  estimate = "mm_diff")


plot(rbind(fit_4, fit_4_diff)) + ggplot2::facet_wrap(~BY, ncol = 3L)

And while the inferential differences may be small, the risk of using differences in conditional AMCEs versus differences in MMs is that both the size and even the direction of subgroup differences can be misleading when presented as differences in AMCEs

fit_3_diff$Estimate <- "AMCE"
fit_4_diff$Estimate <- "MM"

plot(rbind(fit_3_diff, fit_4_diff)) + ggplot2::facet_wrap(~Estimate + BY, ncol = 4L)

6.3 F-Test

  • We are going to perform an F-Test (also called Omnibus test) to formally assess if there is no interactive effect between Attributes and Education
  • We compared a model with interaction and one without interaction to assess whether preferences significantly vary across different groups.
    • If the significance test is <0.05, the data suggest that there is a significance difference between groups
    • If the significance test is >0.05, the data suggest that the two groups show no difference when one of the attribute is changed.
  • 3 steps
    • Estimate the base line model without interactions
    • Estimate the interaction model
    • Perform a F-Test between the baseline model and the interaction model
    • with cregg, we can do it in 1 function
cregg::cj_anova(df_long_format, selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings, 
  by = ~Q77,
  id = ~ResponseId)
Analysis of Deviance Table

Model 1: selected ~ Past.Political.Experience + Penal.proceedings + Policy.Proposal + 
    Penal.proceedings
Model 2: selected ~ Past.Political.Experience + Penal.proceedings + Policy.Proposal + 
    Q77 + Past.Political.Experience:Q77 + Penal.proceedings:Q77 + 
    Policy.Proposal:Q77
  Resid. Df Resid. Dev Df Deviance      F Pr(>F)
1      1103     223.87                          
2      1075     217.63 28   6.2434 1.1014 0.3273

7 Constrained Designs

  • Immigration experiment from Hainmueller, Hopkins, and Yamamoto (2014)
    • Load the data
    • Specify the design with the makeDesign() function
    • Run AMCE for unconstrained design
    • Run AMCE for constrained design
data("immigrationconjoint")
data("immigrationdesign")

## You can also load a design from a .dat file from the Conjoint SDT
#immigrationdesign <- makeDesign(type="file", filename="immigrant.dat")

fit_unconstrained <- cjoint::amce(Chosen_Immigrant ~  Gender + Education + `Language Skills`  +
                    `Country of Origin` + Job + `Job Experience` + `Job Plans`  +
                    `Reason for Application` + `Prior Entry`,
                data=immigrationconjoint,
                cluster=TRUE, 
                respondent.id="CaseID",
                design="uniform"
                )


fit_constrained <- cjoint::amce(Chosen_Immigrant ~  Gender + Education + `Language Skills`  +
                    `Country of Origin` + Job + `Job Experience` + `Job Plans`  +
                    `Reason for Application` + `Prior Entry`,
                data=immigrationconjoint,
                cluster=TRUE, 
                respondent.id="CaseID",
                design=immigrationdesign
                )

# let's check the estimate for "Computer Programmer"
amce_unconstrained <- summary(fit_unconstrained)$amce$Estimate[23]
amce_constrained <- summary(fit_constrained)$amce$Estimate[23]

cbind(amce_unconstrained,amce_constrained, round((amce_unconstrained-amce_constrained)*100,2))
     amce_unconstrained amce_constrained      
[1,]         0.05941493       0.07910121 -1.97

8 Non-uniform marginal distribution

We can also introduce non-uniform marginal distribution to improuve the external validity of our experiment. Let’s first fit a baseline model that uses an uniform distribution

# this function calculate the frequencies of our constrained design 
freq_cj <- cregg::cj_freqs(df_long_format,
  selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings,
  id = ~ResponseId
  )


# this function transform the frequencies into proportion to be included in the _pAMCE() function

function_marginal_distribution <- function(freq_cregg=freq_cregg){

data <- freq_cregg %>% 
  group_by(feature) %>% 
  mutate(prop=estimate/sum(estimate))

marginal_f <- c()
for (f in unique(data$feature)){
  marginal_l <- c()
  subset_freq_cj_props <- data %>% subset(feature==f) 
  subset_freq_cj_props$level <- factor(subset_freq_cj_props$level)
  
  for (l in subset_freq_cj_props$level){
    subset_freq_cj_props_filtered <- subset_freq_cj_props %>% filter(level==l)
    marginal_l[l] <- subset_freq_cj_props_filtered$prop
  }
  marginal_f[[f]] <- marginal_l
}
return(marginal_f)
}

# call the function 
non_uniform_distribution  <- function_marginal_distribution(freq_cregg=freq_cj)

non_uniform_distribution
$Past.Political.Experience
Approximately 10 years Approximately 20 years                   None 
             0.3270270              0.3324324              0.3405405 

$Penal.proceedings
                                     No proceedings      The candidate has been convicted of corruption The candidate is under investigation for corruption 
                                          0.3531532                                           0.3324324                                           0.3144144 

$Policy.Proposal
  Acceptance of immigrants conditional on certain requisites Collective expulsion of immigrants and closure of the border               Welcome immigrants and organise human corridor 
                                                   0.3333333                                                    0.3207207                                                    0.3459459 

Let’s modify the distribution using something that makes more sense in the context of this experiment. Specificaly, we want to specify a different marginal distributionfor the attribute Political Experience and Pena Proceedings.

non_uniform_distribution$Past.Political.Experience <- c("None" = 0.01, 
                                                        "Approximately 10 years" =  0.69, 
                                                        "Approximately 20 years" = 0.3)


non_uniform_distribution$Penal.proceedings <- c("No proceedings" = 0.9, 
                                                        "The candidate has been convicted of corruption" =  0.05, 
                                                        "The candidate is under investigation for corruption" = 0.05)


non_uniform_distribution
$Past.Political.Experience
                  None Approximately 10 years Approximately 20 years 
                  0.01                   0.69                   0.30 

$Penal.proceedings
                                     No proceedings      The candidate has been convicted of corruption The candidate is under investigation for corruption 
                                               0.90                                                0.05                                                0.05 

$Policy.Proposal
  Acceptance of immigrants conditional on certain requisites Collective expulsion of immigrants and closure of the border               Welcome immigrants and organise human corridor 
                                                   0.3333333                                                    0.3207207                                                    0.3459459 
# should sum to 1  
sum(non_uniform_distribution$Past.Political.Experience)
[1] 1
  • Integrate the new marginal distribution in the estimation
  • We are using the function design_pAMCE() from the package factorEx (de la Cuesta, Egami, and Imai 2022)
  • model_pAMCE() arguments
    • formula: same as before
    • df: same as before
    • id: same as before
    • target_dist: the non_uniform_distribution from function_marginal_distribution()
    • target_type: marginal (but also join and partial)
model_marginal <- factorEx::model_pAMCE(
  formula = selected ~ 
  Past.Political.Experience + 
  Penal.proceedings + 
  Policy.Proposal +
  Penal.proceedings,
  reg = FALSE,
  data = df_long_format,
  cluster_id = df_long_format$ResponseId,
  target_dist = non_uniform_distribution, 
  target_type = "marginal",
  boot= 500, 
  numCores = 4, 
)

summary(model_marginal)

----------------
Population AMCEs:
----------------
 target_dist                    factor                                                        level    Estimate Std. Error p value    
    target_1 Past.Political.Experience                                       Approximately 20 years  0.08588296 0.04409969   0.051   .
    target_1 Past.Political.Experience                                                         None -0.07412879 0.04805617   0.123    
    target_1         Penal.proceedings               The candidate has been convicted of corruption -0.38264193 0.04488193   0.000 ***
    target_1         Penal.proceedings          The candidate is under investigation for corruption -0.25671930 0.04630749   0.000 ***
    target_1           Policy.Proposal Collective expulsion of immigrants and closure of the border -0.30004056 0.06150532   0.000 ***
    target_1           Policy.Proposal               Welcome immigrants and organise human corridor -0.05917951 0.05411017   0.274    
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
adjusted_marginal <- model_marginal$AMCE$Past.Political.Experience %>% filter(type=="target_1")  %>% dplyr::select(estimate)
unadjusted_marginal <- model_marginal$AMCE$Past.Political.Experience %>% filter(type=="sample")  %>% dplyr::select(estimate)

cbind(adjusted_marginal, unadjusted_marginal, (adjusted_marginal-unadjusted_marginal)*100)
     estimate    estimate estimate
3  0.08588296  0.01421017 7.167279
4 -0.07412879 -0.12577195 5.164315

9 Power Analysis

The power analysis R-package cjsimPWR is not yet on CRAN and so you have to install the package directly for my GitHub. To do so, uncomment the following lines

# if(!require(devtools)) install.packages("devtools")
# library(devtools)
# devtools::install_github("albertostefanelli/cjsimPWR")

The current version of the power_sim() function can be utilized to calculate a priori power, Type S, and Type M error rates based on conjoint designs typically used in political science and neighboring disciplines. It can be used when the treatment effect (i.e., AMCE) is either (1) considered to be homogeneous across all respondents or (2) expected to differ substantially across different segments of the population.

Let’s start with an experiment with the following attributes with no subgroups (i.e., homogeneous treatment effects)

  • Number of attributes (n_attributes): 3
    • Number of levels Attribute 1 (n_levels): 2
    • Number of levels Attribute 2 (n_levels): 3
    • Number of levels Attribute 3 (n_levels): 5
  • Number of respondents (units): 500
  • Number of tasks per respondent (n_tasks): 5
  • Hypothesized AMCE compared to reference level (true_coef):
    • Attribute 1 - Level 1 (ref: 0): 0.20
    • Attribute 2 - Level 1 (ref: 0): -0.1
    • Attribute 2 - Level 2 (ref: 0): 0.10
    • Attribute 3 - Level 1 (ref: 0): -0.03
    • Attribute 3 - Level 2 (ref: 0) -0.1
    • Attribute 3 - Level 3 (ref: 0): -0.1
    • Attribute 3 - Level 4 (ref: 0): 0.1
  • Individual level heterogeneity (sigma.u_k): 0.05
  • Number of simulations (sim_runs): 100
  • Seed for reproducibility of the simulation results (seed) : 2114
df_power <- power_sim(
  n_attributes = 3,
  n_levels = c(2, 3, 5),
  units = 600,
  n_tasks = 5,
  true_coef = list(0.2, c(-0.1, 0.1), c(-0.03, -0.1, -0.1, 0.1)),
  sigma.u_k = 0.05,
  sim_runs = 100,
  seed = 2114
)

df_power |> print(n=40)
# A tibble: 7 × 7
  attrb level true_coef power       typeS       typeM        seed
  <chr> <dbl>     <dbl> <chr>       <chr>       <chr>       <dbl>
1 var_1     1      0.2  1.00 (0.00) 0.00 (0.00) 0.97 (0.07)  2114
2 var_2     1     -0.1  1.00 (0.00) 0.00 (0.00) 0.95 (0.17)  2114
3 var_2     2      0.1  1.00 (0.00) 0.00 (0.00) 1.02 (0.16)  2114
4 var_3     1     -0.03 0.33 (0.47) 0.00 (0.00) 1.89 (0.34)  2114
5 var_3     2     -0.1  1.00 (0.00) 0.00 (0.00) 0.96 (0.18)  2114
6 var_3     3     -0.1  1.00 (0.00) 0.00 (0.00) 0.96 (0.20)  2114
7 var_3     4      0.1  1.00 (0.00) 0.00 (0.00) 1.03 (0.19)  2114

Next, let’s calculate power for an experiment with a similar design but 3 subgroups of respondents (Democrat, Independent, Republican) with hypothesized differences in conditional AMCEs (i.e., heterogeneous treatment effects).

df_power_interaction <- power_sim(
          n_attributes = 3,
          n_levels = c(2, 3, 6),
          n_tasks = 4,
          group_name = c("Democrat", "Independent", "Republican"),
          units = c(500, 200, 500),
          true_coef = list("Democrat" = list(0.2, c(-0.1, 0.1), c(-0.1, -0.1, -0.1, 0.1, -0.03)),
                           "Independent" = list(0.1, c(-0.2, 0.05),  c(-0.1, 0.1, 0.1, 0.3, 0.01)),
                           "Republican" = list(0.1, c(-0.1, -0.0005),  c(-0.1, 0.2, -0.1, 0.1, -0.01))
                           ),
          sigma.u_k = 0.05,
          sim_runs = 100,
          seed = 12
)

df_power_interaction |> print(n=40)
# A tibble: 24 × 8
   id_grp      attrb level true_coef power       typeS       typeM          seed
   <chr>       <chr> <dbl>     <dbl> <chr>       <chr>       <chr>         <dbl>
 1 Democrat    var_1     1    0.2    1.00 (0.00) 0.00 (0.00) 0.97 (0.08)      12
 2 Democrat    var_2     1   -0.1    1.00 (0.00) 0.00 (0.00) 0.95 (0.23)      12
 3 Democrat    var_2     2    0.1    1.00 (0.00) 0.00 (0.00) 1.01 (0.21)      12
 4 Democrat    var_3     1   -0.1    0.97 (0.17) 0.00 (0.00) 0.98 (0.24)      12
 5 Democrat    var_3     2   -0.1    0.95 (0.22) 0.00 (0.00) 1.05 (0.25)      12
 6 Democrat    var_3     3   -0.1    0.96 (0.20) 0.00 (0.00) 1.00 (0.25)      12
 7 Democrat    var_3     4    0.1    0.97 (0.17) 0.00 (0.00) 1.05 (0.26)      12
 8 Democrat    var_3     5   -0.03   0.22 (0.42) 0.00 (0.00) 2.27 (0.39)      12
 9 Independent var_1     1    0.1    0.98 (0.14) 0.00 (0.00) 1.00 (0.23)      12
10 Independent var_2     1   -0.2    1.00 (0.00) 0.00 (0.00) 0.94 (0.15)      12
11 Independent var_2     2    0.05   0.41 (0.49) 0.00 (0.00) 1.60 (0.40)      12
12 Independent var_3     1   -0.1    0.49 (0.50) 0.00 (0.00) 1.12 (0.28)      12
13 Independent var_3     2    0.1    0.57 (0.50) 0.00 (0.00) 1.25 (0.32)      12
14 Independent var_3     3    0.1    0.60 (0.49) 0.00 (0.00) 1.17 (0.29)      12
15 Independent var_3     4    0.3    1.00 (0.00) 0.00 (0.00) 1.00 (0.12)      12
16 Independent var_3     5    0.01   0.03 (0.17) 0.00 (0.00) 10.55 (1.22)     12
17 Republican  var_1     1    0.1    1.00 (0.00) 0.00 (0.00) 0.98 (0.14)      12
18 Republican  var_2     1   -0.1    1.00 (0.00) 0.00 (0.00) 0.96 (0.19)      12
19 Republican  var_2     2   -0.0005 0.04 (0.20) 0.50 (0.58) 5.99 (103.52)    12
20 Republican  var_3     1   -0.1    0.85 (0.36) 0.00 (0.00) 1.01 (0.23)      12
21 Republican  var_3     2    0.2    1.00 (0.00) 0.00 (0.00) 1.04 (0.13)      12
22 Republican  var_3     3   -0.1    0.91 (0.29) 0.00 (0.00) 0.94 (0.20)      12
23 Republican  var_3     4    0.1    0.94 (0.24) 0.00 (0.00) 1.08 (0.28)      12
24 Republican  var_3     5   -0.01   0.07 (0.26) 0.43 (0.53) 0.98 (7.57)      12

10 Conjoint Mixture Model

  • We are going to use the Kirkland and Coppock (2018) conjoint experiment on partisan labels
  • The authors use CJ to asses the impact of partisan labels in non-partisan elections
  • Before getting into mixture modelling we need to load and prepare the data for the analysis
df_base <- readr::read_csv("https://github.com/albertostefanelli/conjoint_class/raw/master/data/Kirkland_Coppock_mturk_replication.csv")

df_base <- df_base  %>% 
       mutate_at(vars(Political,Job,Party,Gender,Age,Race), as.factor) %>% 
       filter(Party != "non-partisan") %>% 
       mutate_at(vars(Party), droplevels)


# Get same reference categories as Kirkland & Coppock (2018)
df_base$Political <- relevel(df_base$Political, ref = "None")
df_base$Job <- relevel(df_base$Job, ref = "Educator")
df_base$Party <- relevel(df_base$Party, ref = "Independent")
df_base$Gender <- relevel(df_base$Gender, ref = "Female")
df_base$Age <- relevel(df_base$Age, ref = "35")
df_base$Race <- relevel(df_base$Race, ref = "White")
  • We are going to git a pooled model disregarding, for the moment, any group differences
# model formula 
frml <- win ~ Party + Political + Job + Race + Age + Gender

# Pooled Model
fit_1 <- amce(df_base,
              frml,
              id = ~ resp_mturkid)


# let's extract the coefficents to plot only the effect of the attribute party
m0pool <- fit_1 %>%
  filter(feature == "Party") %>%
  mutate(model = "Pooled") %>%
  dplyr::select(model, level, estimate, std.error, lower, upper)

# Pooled Visualisation
vis1 <- ggplot(m0pool, aes(x = level, y = estimate)) +
  geom_point() +
  geom_segment(aes(x = level, xend = level,
                   y = lower, yend = upper)) +
  theme_bw() +
  theme(
        plot.title = element_text(size = 10),
        axis.text.x = element_text(size = 7)
        ) +
  xlab("") + ylab("") +
  geom_hline(yintercept = 0, color = "black", linetype = "dashed") +
  scale_y_continuous(breaks = c(-0.3, -0.2, -0.1, 0, 0.1)) +
  coord_flip(ylim = c(-0.35, 0.15))

vis1

- In the pooled model, respondents would rather vote for Independent candidates than either Democrats or Republicans. Such results are typical for conjoints with unobserved subgroups. - Let’s now split the sample by respondent party ID and fit a model for each group. This is similar to what the cregg package does when you specify the grouping variable using by=Party

# Interaction: Democrats
mcp_d <- filter(df_base, resp_pid_3_text == "Democrat")
fit_2d <- amce(mcp_d,
               frml,
               id = ~ resp_mturkid,
               level_order = "descending")
fit_2d$resp_party <- "Democrat"

# Interaction: Republican
mcp_r <- filter(df_base, resp_pid_3_text == "Republican")
fit_2r <- amce(mcp_r,
               frml,
               id = ~ resp_mturkid,
               level_order = "descending")
fit_2r$resp_party <- "Republican"

# Interaction Model
m1inter <- rbind(fit_2d, fit_2r) %>%
  filter(feature == "Party") %>%
  mutate(model = "Interaction") %>%
  mutate(Respondent = resp_party) %>%
  dplyr::select(model, level, estimate, std.error, lower, upper, Respondent)

# Interaction Visualisation
vis2 <- m1inter %>%
  mutate(image = ifelse(Respondent == "Democrat",
                        "http://clipart-library.com/images/BiaKRMaBT.gif",
                        "https://i1.wp.com/gifgifs.com//animations/jobs-people/politicians/Republican_elephant.gif")) %>%
  mutate(image = ifelse(level == "Independent",
                        NA, image)) %>%
  ggplot(aes(x = level, y = estimate,
             group = Respondent,
             color = Respondent,
             shape = Respondent)) +
  geom_hline(yintercept = 0, color = "grey30", linetype = "dashed",
             alpha = 0.8) +
  geom_point(aes(y = estimate), position = position_dodge(width = 0.6)) +
  geom_linerange(aes(xmin = level, xmax = level,
                     ymin = lower, ymax = upper),
                 position = position_dodge(width = 0.6)) +
  geom_image(aes(y = estimate + 0.003,
                 image = image),
             size= .07) +
  theme_bw() +
  xlab("") + ylab("") +
  scale_y_continuous(breaks = c(-0.3, -0.2, -0.1, 0, 0.1)) +
  coord_flip(ylim = c(-0.35, 0.15)) +
  theme(
    legend.position = "none",
    plot.title = element_text(size = 10),
    axis.text.x = element_text(size = 7)
  ) +
  scale_colour_manual(
    values = c("#1404BD", "#DE0100"),
    aesthetics = c("colour", "fill")
  )

vis2

- Splitting the sample by respondent’s party identification, we see a clear pattern: - Democrats are unlikely to vote for Republicans and vice-versa. - These sub-groups have an effect of different magnitude and direction. - Let’s now see if we can reproduce these results without observing respondent’s party identification

# tell flexmix that the task are nested within respondents
f1_univ <- update(frml, ~ . | resp_mturkid)

# Model without concomitant variables
set.seed(1402)

fmod0 <- flexmix(f1_univ, data = df_base, k = 2,
                 model = FLXglm(family = "gaussian"),
                 #concomitant = FLXPmultinom(~ democrat + republican)
                 )

parameters(fmod0)[2:3, ]
                          Comp.1       Comp.2
coef.PartyDemocrat   -0.28173962  0.005755052
coef.PartyRepublican  0.07770004 -0.287800220
# we need re-fit the model to obtain sd. errors 
rfmod0 <- refit(fmod0)
fit_3 <- summary(rfmod0)
$Comp.1
                                      Estimate Std. Error z value      Pr(>|z|)    
(Intercept)                          0.3064256  0.0906525  3.3802     0.0007243 ***
PartyDemocrat                       -0.3147047  0.0582583 -5.4019 0.00000006594 ***
PartyRepublican                      0.1287631  0.0710434  1.8125     0.0699155 .  
PoliticalCity Council Member         0.0616649  0.0612786  1.0063     0.3142698    
PoliticalMayor                       0.1823083  0.0610538  2.9860     0.0028263 ** 
PoliticalRepresentative in Congress  0.3347928  0.0640892  5.2239 0.00000017524 ***
PoliticalState Legislator            0.2139046  0.0618667  3.4575     0.0005452 ***
JobAttorney                          0.2022774  0.0686573  2.9462     0.0032172 ** 
JobBusiness Executive                0.1547245  0.0703626  2.1990     0.0278808 *  
JobPolice Officer                    0.0021042  0.0685207  0.0307     0.9755022    
JobSmall Business Owner              0.2528794  0.0825028  3.0651     0.0021760 ** 
RaceAsian                            0.0055452  0.0674145  0.0823     0.9344443    
RaceBlack                            0.0276454  0.0558808  0.4947     0.6207975    
RaceHispanic                         0.0188299  0.0598405  0.3147     0.7530137    
Age45                               -0.0528786  0.0582919 -0.9071     0.3643355    
Age55                                0.0312026  0.0579032  0.5389     0.5899727    
Age65                               -0.0281696  0.0577519 -0.4878     0.6257136    
GenderMale                          -0.0601059  0.0413175 -1.4547     0.1457433    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$Comp.2
                                      Estimate Std. Error  z value              Pr(>|z|)    
(Intercept)                          0.4737660  0.0370646  12.7822 < 0.00000000000000022 ***
PartyDemocrat                       -0.0080373  0.0246772  -0.3257             0.7446531    
PartyRepublican                     -0.2723647  0.0251804 -10.8165 < 0.00000000000000022 ***
PoliticalCity Council Member         0.1577626  0.0264130   5.9729 0.0000000023303537643 ***
PoliticalMayor                       0.2098926  0.0257427   8.1535 0.0000000000000003536 ***
PoliticalRepresentative in Congress  0.1104624  0.0299471   3.6886             0.0002255 ***
PoliticalState Legislator            0.1462051  0.0270628   5.4024 0.0000000657416942404 ***
JobAttorney                         -0.0485885  0.0275440  -1.7640             0.0777265 .  
JobBusiness Executive               -0.0509025  0.0270085  -1.8847             0.0594722 .  
JobPolice Officer                   -0.0223847  0.0277967  -0.8053             0.4206453    
JobSmall Business Owner             -0.0183782  0.0289217  -0.6354             0.5251376    
RaceAsian                            0.0039160  0.0259045   0.1512             0.8798410    
RaceBlack                            0.0215717  0.0239588   0.9004             0.3679249    
RaceHispanic                        -0.0245309  0.0245074  -1.0010             0.3168470    
Age45                                0.0707868  0.0255672   2.7687             0.0056287 ** 
Age55                                0.0504756  0.0240597   2.0979             0.0359116 *  
Age65                                0.0222732  0.0243413   0.9150             0.3601730    
GenderMale                          -0.0266516  0.0167021  -1.5957             0.1105564    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# extract estimates first component
fit_3r <- fit_3@components[[1]]$Comp.1@.Data[2:3, ]

fit_3r1 <- as.data.frame(fit_3r) %>%
  mutate(model = "CFMM",
         level = str_remove(row.names(fit_3r), "Party"),
         component = "Component 1")

# extract estimates second component 
fit_3d <- fit_3@components[[1]]$Comp.2@.Data[2:3, ]
fit_3d2 <- as.data.frame(fit_3d) %>%
  mutate(model = "CFMM",
         level = str_remove(row.names(fit_3d), "Party"),
         component = "Component 2")

# calculate CI
m2mix <- rbind(fit_3r1, fit_3d2) %>%
  mutate(estimate = Estimate,
         std.error = `Std. Error`,
         lower = estimate - 1.96 * std.error,
         upper = estimate + 1.96 * std.error) %>%
  dplyr::select(model, level, estimate, std.error, lower, upper, component)

m2mix$component <- forcats::fct_relevel(m2mix$component, "Component 1", "Component 2")

m2mix_vis <- m2mix %>%
  add_row(model = "CFMM",
          level = "Independent", 
          estimate = 0,
          std.error = NA,
          lower = NA,
          upper = NA,
          component = "Component 1") %>%
  add_row(model = "CFMM",
          level = "Independent", 
          estimate = 0,
          std.error = NA,
          lower = NA,
          upper = NA,
          component = "Component 2") %>%
  mutate(image = ifelse(component == "Component 2",
                        "http://clipart-library.com/images/BiaKRMaBT.gif",
                        "https://i1.wp.com/gifgifs.com//animations/jobs-people/politicians/Republican_elephant.gif")) %>%
  mutate(image = ifelse(level == "Independent",
                        NA, image))

m2mix_vis$level <- forcats::fct_relevel(m2mix_vis$level, rev(c("Independent", "Democrat", "Republican")))
m2mix_vis$component <- forcats::fct_relevel(m2mix_vis$component, "Component 2", "Component 1")

vis3 <- m2mix_vis %>%
ggplot(aes(x = level, y = estimate,
             group = component,
             color = component,
             shape = component)) +
  geom_hline(yintercept = 0, color = "grey30", linetype = "dashed",
             alpha = 0.8) +
  geom_point(aes(y = estimate), position = position_dodge(width = 0.6)) +
  geom_linerange(aes(xmin = level, xmax = level,
                     ymin = lower, ymax = upper),
                 position = position_dodge(width = 0.6)) +
  geom_image(aes(y = estimate + 0.003,
                 image = image),
             size= .07) +
  theme_bw() +
  xlab("") + ylab("") +
  coord_flip(ylim = c(-0.35, 0.15)) +
  theme(
    legend.position = "none",
    plot.title = element_text(size = 10),
    axis.text.x = element_text(size = 7)
  ) +
  scale_colour_manual(
    values = c("#1404BD", "#DE0100"),
    aesthetics = c("colour", "fill")
  )


## Composite visualisation
ggpubr::ggarrange(vis1 +
                    ggtitle("Pooled model") +
                    theme(
                      axis.text.y = element_text(face = "bold",
                                                 size = 9)
                    ), 
                  vis2 +
                    ggtitle("Split by observed cov.") +
                    theme(
                      axis.text.y = element_blank()
                    ), 
                  vis3 +
                    ggtitle("CFMM (no covariates)")+
                    theme(
                      axis.text.y = element_blank()
                    ),
                  ncol = 3, nrow = 1,
                  widths = c(3, 3, 3),
                  common.legend = F)

  • Using CFMM with two classes, we successfully identify the existing subgroups.
  • Standard errors capture the effect provided by Kirkland and Coppock (2018), without the need to include respondent’s party.

References

Cuesta, Brandon de la, Naoki Egami, and Kosuke Imai. 2022. “Improving the External Validity of Conjoint Analysis: The Essential Role of Profile Distribution.” Political Analysis 30 (1): 19–45. https://doi.org/10.1017/pan.2020.40.
Hainmueller, Jens, Daniel J. Hopkins, and Teppei Yamamoto. 2014. “Causal Inference in Conjoint Analysis: Understanding Multidimensional Choices via Stated Preference Experiments.” Political Analysis 22 (1): 1–30. https://doi.org/f5qzwp.
Kirkland, Patricia A., and Alexander Coppock. 2018. “Candidate Choice Without Party Labels:” Political Behavior 40 (3): 571–91. https://doi.org/gd4r43.