Home > Net >  pivot_wider and unnest for a survey dataset questions needed to be reorganised
pivot_wider and unnest for a survey dataset questions needed to be reorganised

Time:08-21

I have this example data set that is consisted of 2 countries (Austria and France), 3 subsets (child, teenager, adult), 2 questions and 5 types of answers. This dataset looks as follows :

country   subset                      Questions          Answers
1  Austria    child            Do you feel lonely?            agree
2  Austria    child            Do you feel lonely?         disagree
3  Austria    child            Do you feel lonely?    totally agree
4  Austria teenager            Do you feel lonely?       don't know
5  Austria teenager            Do you feel lonely?         disagree
6  Austria teenager            Do you feel lonely? totally disagree
7  Austria    adult            Do you feel lonely?            agree
8  Austria    adult            Do you feel lonely?         disagree
9  Austria    adult            Do you feel lonely?    totally agree
10  France    child            Do you feel lonely?       don't know
11  France    child            Do you feel lonely?         disagree
12  France    child            Do you feel lonely? totally disagree
13  France teenager            Do you feel lonely?            agree
14  France teenager            Do you feel lonely?         disagree
15  France teenager            Do you feel lonely?    totally agree
16  France    adult            Do you feel lonely?       don't know
17  France    adult            Do you feel lonely?         disagree
18  France    adult            Do you feel lonely? totally disagree
19 Austria    child Do you prefer staying at home?            agree
20 Austria    child Do you prefer staying at home?         disagree
21 Austria    child Do you prefer staying at home?    totally agree
22 Austria teenager Do you prefer staying at home?       don't know
23 Austria teenager Do you prefer staying at home?         disagree
24 Austria teenager Do you prefer staying at home? totally disagree
25 Austria    adult Do you prefer staying at home?            agree
26 Austria    adult Do you prefer staying at home?         disagree
27 Austria    adult Do you prefer staying at home?    totally agree
28  France    child Do you prefer staying at home?       don't know
29  France    child Do you prefer staying at home?         disagree
30  France    child Do you prefer staying at home? totally disagree
31  France teenager Do you prefer staying at home?            agree
32  France teenager Do you prefer staying at home?         disagree
33  France teenager Do you prefer staying at home?    totally agree
34  France    adult Do you prefer staying at home?       don't know
35  France    adult Do you prefer staying at home?         disagree
36  France    adult Do you prefer staying at home? totally disagree

What I am looking for is that the previous dataset becomes as follows :

country   subset Do_you_feel_lonely Do_you_prefer_staying_at_home
1  Austria    child              agree                         agree
2  Austria    child           disagree                      disagree
3  Austria    child      totally agree                 totally agree
4  Austria teenager         don't know                    don't know
5  Austria teenager           disagree                      disagree
6  Austria teenager   totally disagree              totally disagree
7  Austria    adult              agree                         agree
8  Austria    adult           disagree                      disagree
9  Austria    adult      totally agree                 totally agree
10  France    child         don't know                    don't know
11  France    child           disagree                      disagree
12  France    child   totally disagree              totally disagree
13  France teenager              agree                         agree
14  France teenager           disagree                      disagree
15  France teenager      totally agree                 totally agree
16  France    adult         don't know                    don't know
17  France    adult           disagree                      disagree
18  France    adult   totally disagree              totally disagree

I already checked other similar questions that included the functions reshape2::dcast and tidyr::spread, but I guess I am making some error and I am not able to figure it out on my own.

CodePudding user response:

This type or re-shaping is a pivot to wide format. You can achieve it with:

library(tidyverse)

df %>%
pivot_wider(names_from = Questions, values_from = Answers, values_fn = list) %>% 
unnest(3:4)
#> # A tibble: 18 x 4
#>    country subset   `Do you feel lonely?` `Do you prefer staying at home?`
#>    <chr>   <chr>    <chr>                 <chr>                           
#>  1 Austria child    agree                 agree                           
#>  2 Austria child    disagree              disagree                        
#>  3 Austria child    totally agree         totally agree                   
#>  4 Austria teenager don't know            don't know                      
#>  5 Austria teenager disagree              disagree                        
#>  6 Austria teenager totally disagree      totally disagree                
#>  7 Austria adult    agree                 agree                           
#>  8 Austria adult    disagree              disagree                        
#>  9 Austria adult    totally agree         totally agree                   
#> 10 France  child    don't know            don't know                      
#> 11 France  child    disagree              disagree                        
#> 12 France  child    totally disagree      totally disagree                
#> 13 France  teenager agree                 agree                           
#> 14 France  teenager disagree              disagree                        
#> 15 France  teenager totally agree         totally agree                   
#> 16 France  adult    don't know            don't know                      
#> 17 France  adult    disagree              disagree                        
#> 18 France  adult    totally disagree      totally disagree

Created on 2022-08-20 with reprex v2.0.2


Data in reproducible format

df <- structure(list(country = c("Austria", "Austria", "Austria", "Austria", 
"Austria", "Austria", "Austria", "Austria", "Austria", "France", 
"France", "France", "France", "France", "France", "France", "France", 
"France", "Austria", "Austria", "Austria", "Austria", "Austria", 
"Austria", "Austria", "Austria", "Austria", "France", "France", 
"France", "France", "France", "France", "France", "France", "France"
), subset = c("child", "child", "child", "teenager", "teenager", 
"teenager", "adult", "adult", "adult", "child", "child", "child", 
"teenager", "teenager", "teenager", "adult", "adult", "adult", 
"child", "child", "child", "teenager", "teenager", "teenager", 
"adult", "adult", "adult", "child", "child", "child", "teenager", 
"teenager", "teenager", "adult", "adult", "adult"), 
Questions = c("Do you feel lonely?", 
"Do you feel lonely?", "Do you feel lonely?", "Do you feel lonely?", 
"Do you feel lonely?", "Do you feel lonely?", "Do you feel lonely?", 
"Do you feel lonely?", "Do you feel lonely?", "Do you feel lonely?", 
"Do you feel lonely?", "Do you feel lonely?", "Do you feel lonely?", 
"Do you feel lonely?", "Do you feel lonely?", "Do you feel lonely?", 
"Do you feel lonely?", "Do you feel lonely?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?", "Do you prefer staying at home?", 
"Do you prefer staying at home?"), Answers = c("agree", "disagree", 
"totally agree", "don't know", "disagree", "totally disagree", 
"agree", "disagree", "totally agree", "don't know", "disagree", 
"totally disagree", "agree", "disagree", "totally agree", "don't know", 
"disagree", "totally disagree", "agree", "disagree", "totally agree", 
"don't know", "disagree", "totally disagree", "agree", "disagree", 
"totally agree", "don't know", "disagree", "totally disagree", 
"agree", "disagree", "totally agree", "don't know", "disagree", 
"totally disagree")), class = "data.frame", row.names = c(NA, 
-36L))
  • Related