I am trying to figure out the best way to transform and analyze this dataframe that has a column for each answer. I am trying to create a new column that displays the combination when multiple answers selected.
My original dataframe looks like this:
df <- data.frame('ID' = c("001", "002", "003", "004", "005", "006"),
'Answer_1' = c("1", "1", "0", "0", "1", "0"),
'Answer_2' = c("1", "0", "1", "0", "0", "1"),
'Answer_3' = c("1", "1", "0", "1", "0", "1"))
I am trying to consolidate those multiple answers into a singular column so I can run descriptives of the different answer combinations.
I'm looking to come up with a dataframe that looks like this:
df1 <- data.frame('ID' = c("001", "002", "003", "004", "005", "006"),
'Answer_4' = c("Answer_1 | Answer_2 | Answer_2", "Answer_1 | Answer_3",
"Answer_2", "Answer_3", "Answer_1", "Answer_2 | Answer_3"
CodePudding user response:
If I understood this will help you
Data
df <- data.frame('ID' = c("001", "002", "003", "004", "005", "006"),
'Answer_1' = c("1", "1", "0", "0", "1", "0"),
'Answer_2' = c("1", "0", "1", "0", "0", "1"),
'Answer_3' = c("1", "1", "0", "1", "0", "1"))
Code
collapse_str <- function(x){
paste0(na.omit(x),collapse = " | ")
}
df %>%
mutate(
across(
.cols = -ID,
.fns = function(x) ifelse(x == 1,cur_column(),NA_character_)
)
) %>%
rowwise() %>%
mutate(Answer_4 = collapse_str(c_across(cols = -ID)))
Output
# A tibble: 6 x 5
# Rowwise:
ID Answer_1 Answer_2 Answer_3 Answer_4
<chr> <chr> <chr> <chr> <chr>
1 001 Answer_1 Answer_2 Answer_3 Answer_1 | Answer_2 | Answer_3
2 002 Answer_1 NA Answer_3 Answer_1 | Answer_3
3 003 NA Answer_2 NA Answer_2
4 004 NA NA Answer_3 Answer_3
5 005 Answer_1 NA NA Answer_1
6 006 NA Answer_2 Answer_3 Answer_2 | Answer_3
CodePudding user response:
You could convert the 0/1 as.logical
and subset the names
. The 0/1 should be numeric though, I thus added type.convert
.
df[1] |>
cbind(Answer_4=sapply(asplit(type.convert(df[-1], as.is=TRUE), 1), \(x) {
names(df[-1])[as.logical(x)] |>
paste(collapse=' | ')
}))
# ID Answer_4
# 1 001 Answer_1 | Answer_2 | Answer_3
# 2 002 Answer_1 | Answer_3
# 3 003 Answer_2
# 4 004 Answer_3
# 5 005 Answer_1
# 6 006 Answer_2 | Answer_3
Data:
df <- structure(list(ID = c("001", "002", "003", "004", "005", "006"
), Answer_1 = c("1", "1", "0", "0", "1", "0"), Answer_2 = c("1",
"0", "1", "0", "0", "1"), Answer_3 = c("1", "1", "0", "1", "0",
"1")), class = "data.frame", row.names = c(NA, -6L))