I have two variables in R. The first ("candimmi") represents political candidates' opinions on immigration. The second ("voterimmi") represents voters' opinions on immigration. Both variables have the same three levels: anti-immigration, intermediate, or pro-immigration.
My issue is that I want to create a new variable stating whether there is congruence between the voter and the political candidate. The levels of the new variable would be called "both anti-immigrant", "both intermediate", "both pro-immigration" and "mismatch".
Can any of you give me some advice on how to do this?
Thanks in advance!
Best, Malte
I have tried finding solutions already, but can't find any answers to my question online.
CodePudding user response:
You can use case_when, which is dplyr's vectorised way of writing a chain of if/else conditions (an alternative to nesting ifelse calls):
set.seed(05062020)
library(dplyr)
responses <- c("Anti","Intermed","Pro")
df <- data.frame(candidate = sample(responses, 10, replace = TRUE),
                 voter = sample(responses, 10, replace = TRUE))
df2 <- df %>% mutate(result = case_when(candidate %in% "Anti" & voter %in% "Anti" ~ "Both Anti",
                                        candidate %in% "Intermed" & voter %in% "Intermed" ~ "Both Intermed",
                                        candidate %in% "Pro" & voter %in% "Pro" ~ "Both Pro",
                                        candidate != voter ~ "Discordant"))
# candidate voter result
# 1 Pro Intermed Discordant
# 2 Anti Anti Both Anti
# 3 Pro Pro Both Pro
# 4 Pro Anti Discordant
# 5 Pro Anti Discordant
# 6 Pro Pro Both Pro
# 7 Pro Intermed Discordant
# 8 Intermed Pro Discordant
# 9 Intermed Intermed Both Intermed
# 10 Anti Pro Discordant
A base R way to do it is to use nested ifelse statements:
df$result <- ifelse(df$candidate %in% "Anti" & df$voter %in% "Anti", "Both Anti",
             ifelse(df$candidate %in% "Intermed" & df$voter %in% "Intermed", "Both Intermed",
             ifelse(df$candidate %in% "Pro" & df$voter %in% "Pro", "Both Pro",
             ifelse(df$candidate != df$voter, "Discordant", NA))))
# > df
# candidate voter result
# 1 Pro Intermed Discordant
# 2 Anti Anti Both Anti
# 3 Pro Pro Both Pro
# 4 Pro Anti Discordant
# 5 Pro Anti Discordant
# 6 Pro Pro Both Pro
# 7 Pro Intermed Discordant
# 8 Intermed Pro Discordant
# 9 Intermed Intermed Both Intermed
# 10 Anti Pro Discordant
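One thing worth knowing about the nested ifelse above is how it treats missing values. Here is a minimal sketch, assuming a made-up data frame df_na with one missing voter response (this data is not from the question): %in% never returns NA, while != does, so a row with a missing opinion ends up as NA rather than as a match or a mismatch.

```r
# Hypothetical data with one missing voter response (illustration only)
df_na <- data.frame(candidate = c("Anti", "Pro", "Pro"),
                    voter     = c("Anti", NA,    "Anti"))

# %in% treats NA as "no match"; == and != propagate NA instead
in_check <- NA %in% "Anti"   # FALSE
eq_check <- NA == "Anti"     # NA

# Same nested ifelse as above, applied to the data with a missing value
df_na$result <- ifelse(df_na$candidate %in% "Anti" & df_na$voter %in% "Anti", "Both Anti",
                ifelse(df_na$candidate %in% "Intermed" & df_na$voter %in% "Intermed", "Both Intermed",
                ifelse(df_na$candidate %in% "Pro" & df_na$voter %in% "Pro", "Both Pro",
                ifelse(df_na$candidate != df_na$voter, "Discordant", NA))))
df_na$result
```

If you would rather count a missing opinion as a mismatch, you would have to handle it explicitly, for example with is.na().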
CodePudding user response:
Here is a simple approach using the base R functions factor and interaction (using @jpsmith's example data.frame with a different random seed). At its core, interaction automatically creates a new factor with the combined levels, which you can then rename if you like (this might be useful with many factor levels).
set.seed(234) # fixed random seed for reproducibility
responses <- c("Anti", "Intermed", "Pro")
congruence <- c("both anti-immigrant", "both intermediate", "both pro-immigration", "mismatch")
df <- data.frame(candidate = sample(responses, 10, replace = TRUE),
                 voter = sample(responses, 10, replace = TRUE))
df$candidate <- factor(df$candidate, levels=responses) # make sure you have all the levels
df$voter <- factor(df$voter, levels=responses) # make sure you have all the levels
df$congruence <- with(df, interaction(candidate, voter)) # create new factor representing both levels
levels(df$congruence) <- congruence[c(1,4,4,4,2,4,4,4,3)] # match up factor levels to rename
df
#> candidate voter congruence
#> 1 Anti Pro mismatch
#> 2 Pro Pro both pro-immigration
#> 3 Intermed Intermed both intermediate
#> 4 Intermed Pro mismatch
#> 5 Intermed Intermed both intermediate
#> 6 Intermed Intermed both intermediate
#> 7 Anti Anti both anti-immigrant
#> 8 Anti Anti both anti-immigrant
#> 9 Pro Intermed mismatch
#> 10 Intermed Pro mismatch
Created on 2022-04-05 by the reprex package (v2.0.1)
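The index vector c(1,4,4,4,2,4,4,4,3) relies on the order in which interaction() lists its levels: by default the first factor varies fastest. As a rough sketch (the helper names f, lv, parts, same and idx are mine, not from the answer above), you can inspect that order and derive the same index vector instead of writing it out by hand:

```r
responses <- c("Anti", "Intermed", "Pro")
f <- factor(responses, levels = responses)

# Levels of the combined factor; the first factor varies fastest
lv <- levels(interaction(f, f))
lv

# Split each "candidate.voter" level and check whether both halves agree
parts <- strsplit(lv, ".", fixed = TRUE)
same  <- vapply(parts, function(p) p[1] == p[2], logical(1))

# Matching levels get the position of their response (1 to 3), the rest get 4 ("mismatch")
idx <- ifelse(same, match(vapply(parts, `[`, "", 1), responses), 4L)
idx
```

This assumes none of the response labels contains a "." themselves, since that is the separator interaction() uses by default.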
CodePudding user response:
Both of the other answers work fine, but the simplest solution is to use just one ifelse(). Below I first create some sample data and then show how you would use ifelse() in either the tidyverse or base R if you prefer.
library(tidyverse)
# Create data sample
d <- crossing(
candimmi = c("anti", "inter", "pro"),
voterimmi = candimmi
)
d |>
  mutate(new_tidy = ifelse(candimmi != voterimmi,
                           "mismatch",
                           str_c("both ", candimmi)))
#> # A tibble: 9 × 3
#> candimmi voterimmi new_tidy
#> <chr> <chr> <chr>
#> 1 anti anti both anti
#> 2 anti inter mismatch
#> 3 anti pro mismatch
#> 4 inter anti mismatch
#> 5 inter inter both inter
#> 6 inter pro mismatch
#> 7 pro anti mismatch
#> 8 pro inter mismatch
#> 9 pro pro both pro
d$new_base <- ifelse(d$candimmi != d$voterimmi,
                     "mismatch",
                     paste("both", d$candimmi))
d
#> # A tibble: 9 × 3
#> candimmi voterimmi new_base
#> <chr> <chr> <chr>
#> 1 anti anti both anti
#> 2 anti inter mismatch
#> 3 anti pro mismatch
#> 4 inter anti mismatch
#> 5 inter inter both inter
#> 6 inter pro mismatch
#> 7 pro anti mismatch
#> 8 pro inter mismatch
#> 9 pro pro both pro
Created on 2022-04-05 by the reprex package (v2.0.1)
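If you want the exact level names from the question rather than "both anti" etc., one option is a named lookup vector. This is only a sketch in base R: the values "anti", "inter" and "pro" are the sample values used above, and the names labels and congruence are my own, so adjust them to whatever your real data contains.

```r
# Sample data: every combination of the three (assumed) opinion values
d <- expand.grid(candimmi  = c("anti", "inter", "pro"),
                 voterimmi = c("anti", "inter", "pro"),
                 stringsAsFactors = FALSE)

# Hypothetical lookup from the raw value to the label asked for in the question
labels <- c(anti  = "both anti-immigrant",
            inter = "both intermediate",
            pro   = "both pro-immigration")

# One ifelse(): look up the label when both agree, otherwise "mismatch"
d$congruence <- ifelse(d$candimmi == d$voterimmi,
                       unname(labels[d$candimmi]),
                       "mismatch")

# Optional: store the result as a factor with the four levels in a fixed order
d$congruence <- factor(d$congruence, levels = c(unname(labels), "mismatch"))
d
```

As with any ifelse() on ==, rows where either opinion is NA will come out as NA.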