Pivot data and match variable with multiple other variables

Time:08-17

I have a dataset of friendship nominations, along with how each pair of friends interacts (by phone, online, or in person). I'm trying to reshape it into an edge list while keeping flags for whether the respondents interact in person only, virtually only, or both. Sample data and desired output are below:

Data
df<-read.table(text= "id    friend1 friend2 friend3 friend1_phone   friend1_online  friend1_irl friend2_phone   friend2_online  friend2_irl friend3_phone   friend3_online  friend3_irl
1   4   12  7   1   1   1   0   0   1   0   1   0
2   8   6   7   0   1   1   0   1   0   1   0   0
3   9   NA  NA  1   1   1   NA  NA  NA  NA  NA  NA
4   15  7   2   1   0   0   0   0   0   1   1   1
5   2   20  7   1   0   1   1   0   0   0   1   0
6   19  NA  9   1   0   0   NA  NA  NA  1   1   1
7   12  20  8   1   0   1   0   0   1   0   1   0
8   3   17  10  0   0   0   0   1   0   1   1   0
9   NA  15  19  NA  NA  NA  1   1   0   0   1   0
10  2   16  11  1   1   1   0   0   1   0   1   0", header = TRUE)
Expected output
df_long<-read.table(text= "id   alter   virtual_only    irl_only    both
1   4   0   0   1
1   12  0   1   0
1   7   1   0   0
2   8   0   0   1
2   6   1   0   0
2   7   1   0   0
3   9   0   0   1
4   15  1   0   0
4   7   1   0   0
4   2   0   0   1
5   2   0   0   1
5   20  1   0   0
5   7   0   0   1
6   19  1   0   0
6   9   0   0   1
7   12  0   0   1
7   20  0   1   0
7   8   1   0   0
8   3   0   1   0
8   17  1   0   0
8   10  1   0   0
9   15  1   0   0
9   19  1   0   0
10  2   0   0   1
10  16  0   1   0
10  11  1   0   0", header = TRUE)

CodePudding user response:

You could transform the data like this:

library(dplyr)
library(tidyr)

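df %>%
  # Suffix the friend ID columns so every column name reads <friend>_<variable>
  rename_with(~ paste0(.x, "_alter"), friend1:friend3) %>%
  # Discard the friend prefix (NA) and turn each suffix into its own column (.value)
  pivot_longer(-id, names_to = c(NA, ".value"), names_sep = "_",
               values_drop_na = TRUE)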

# # A tibble: 26 × 5
#       id alter phone online   irl
#    <int> <int> <int>  <int> <int>
#  1     1     4     1      1     1
#  2     1    12     0      0     1
#  3     1     7     0      1     0
#  4     2     8     0      1     1
#  5     2     6     0      1     0
#  6     2     7     1      0     0
#  7     3     9     1      1     1
#  8     4    15     1      0     0
#  9     4     7     0      0     0
# 10     4     2     1      1     1
# # … with 16 more rows

From there, you can derive the virtual_only, irl_only, and both columns from phone, online, and irl with mutate().
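As a rough sketch of that last step, the flags might be built as follows. This assumes "virtual" means contact by phone or online, which is my reading of the expected output rather than something stated in the question. The helper column `virtual` and the object name `edges` are illustrative only, and the data is cut down to ids 1 and 3 so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Two rows of the question's data (ids 1 and 3), so the example is self-contained
df <- read.table(text = "id friend1 friend2 friend3 friend1_phone friend1_online friend1_irl friend2_phone friend2_online friend2_irl friend3_phone friend3_online friend3_irl
1 4 12 7 1 1 1 0 0 1 0 1 0
3 9 NA NA 1 1 1 NA NA NA NA NA NA", header = TRUE)

edges <- df %>%
  rename_with(~ paste0(.x, "_alter"), friend1:friend3) %>%
  pivot_longer(-id, names_to = c(NA, ".value"), names_sep = "_",
               values_drop_na = TRUE) %>%
  mutate(
    # Assumption: phone or online contact both count as "virtual"
    virtual      = phone == 1 | online == 1,
    virtual_only = as.integer(virtual & irl == 0),
    irl_only     = as.integer(!virtual & irl == 1),
    both         = as.integer(virtual & irl == 1)
  ) %>%
  # Keep only the edge list columns requested in the question
  select(id, alter, virtual_only, irl_only, both)

edges
```

Under this rule, a pair with all three indicators equal to 0 (such as id 4 and alter 7 in the output above) gets 0 in all three flags, so you may want to decide separately how such pairs should be classified.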
