Home > database >  How to convert a daatset where some subjects chose multiple answers into a dummy variables format?
How to convert a daatset where some subjects chose multiple answers into a dummy variables format?

Time:10-06

I have this example dataset

df <- data.frame(subjects = 1:12,
                 Why_are_you_not_happy = 
                   c(1,2,"1,2,5",5,1,2,"3,4",3,2,"1,5",3,4),
                 why_are_you_sad = 
                   c("1,2,3",1,2,3,"4,5,3",2,1,4,3,1,1,1) )

And would like to convert it into a dummy variables format (based on the 5 answers of each question). Can someone guide me through an effective way ? thanks.

CodePudding user response:

You can separate_rows for multiple choices, convert to dummy and summarise by subjects (to get one row per subjects, with all their choices).

library(fastDummies)
library(tidyr)
library(dplyr)

df %>% 
  separate_rows(Why_are_you_not_happy, why_are_you_sad) %>% 
  dummy_cols(c("Why_are_you_not_happy", "why_are_you_sad"),
             remove_selected_columns = TRUE) %>% 
  group_by(subjects) %>% 
  summarise(across(everything(), max))

output

# A tibble: 12 × 11
   subjects Why_are_you…¹ Why_a…² Why_a…³ Why_a…⁴ Why_a…⁵ why_a…⁶ why_a…⁷ why_a…⁸ why_a…⁹ why_a…˟
      <int>         <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>
 1        1             1       0       0       0       0       1       1       1       0       0
 2        2             0       1       0       0       0       1       0       0       0       0
 3        3             1       1       0       0       1       0       1       0       0       0
 4        4             0       0       0       0       1       0       0       1       0       0
 5        5             1       0       0       0       0       0       0       1       1       1
 6        6             0       1       0       0       0       0       1       0       0       0
 7        7             0       0       1       1       0       1       0       0       0       0
 8        8             0       0       1       0       0       0       0       0       1       0
 9        9             0       1       0       0       0       0       0       1       0       0
10       10             1       0       0       0       1       1       0       0       0       0
11       11             0       0       1       0       0       1       0       0       0       0
12       12             0       0       0       1       0       1       0       0       0       0
  • Related