I have a data frame that looks similar to this:
case_id | pol_demo | pol_demo_online | pol_post_online | pol_petition | pol_cntct_polit | pol_party | pol_other | pol_demo_illegal |
---|---|---|---|---|---|---|---|---|
1311 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
97 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
5480 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2531 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
2291 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
2064 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
The case_id is a unique id for each survey respondent, so it has as many unique values as the data frame has rows. Each of the other columns is a form of behavior that the respondent either did (1) or did not do (0). The full data has 17 behavior columns; the table above shows only eight of them.
I would like to have this transformed into an adjacency matrix. That is, I would like to have a matrix with 17 rows and 17 columns (one for each behavior), and the cells should be the number of times different forms of behavior are combined (the sum for each pair). So using the columns in the example the matrix would look like:
| | pol_demo | pol_demo_online | pol_post_online | pol_petition | pol_cntct_polit | pol_party | pol_other | pol_demo_illegal |
---|---|---|---|---|---|---|---|---|
pol_demo | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_demo_online | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_post_online | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_petition | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 2 |
pol_cntct_polit | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_party | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_other | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_demo_illegal | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
That is, pol_demo is combined with pol_demo_online 3 times, but with pol_demo_illegal only 2 times, and so on.
I have a hunch that I need to work with outer, but I could not figure it out. I would be happy with a tidy solution, but really, any help is much appreciated!
This is a snippet of the data:
dat <- structure(list(pol_demo = c(1, 0, 0, 1, 0, 1), pol_demo_online = c(1,
0, 0, 1, 0, 1), pol_post_online = c(1, 0, 0, 1, 0, 1), pol_petition = c(1,
0, 0, 1, 1, 1), pol_cntct_polit = c(1, 0, 0, 1, 0, 1), pol_party = c(1,
0, 0, 1, 0, 1), pol_other = c(1, 0, 0, 1, 0, 1), pol_demo_illegal = c(0,
0, 0, 1, 0, 1), help_shopping = c(1, 1, 0, 1, 1, 1), help_childcare = c(0,
1, 0, 1, 0, 1), help_general = c(1, 1, 0, 1, 1, 1), help_emo = c(1,
1, 0, 1, 1, 1), help_symb = c(1, 1, 0, 1, 0, 1), help_fin = c(1,
0, 0, 1, 1, 1), help_donation = c(1, 0, 0, 1, 1, 1), help_volunteer = c(1,
0, 0, 1, 0, 1), help_other = c(1, 1, 0, 1, 0, 1), case_id = structure(c(1311,
97, 548, 2531, 2291, 2064), label = "numerical, unique ID per respondent", format.stata = "%9.0g")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
CodePudding user response:
I found a solution in an unrelated question here. I rearranged dat with pivot_longer() to get it into a format that works with that solution.
library(dplyr)
library(tidyr)
# reshape to long format and keep only the behaviors each respondent did
dat <- dat %>%
  pivot_longer(cols = -case_id,
               names_to = "name",
               values_to = "val") %>%
  filter(val != 0) %>%
  select(id = case_id, name)
Then use the relevant part of the linked answer, adapted to your data:
# create a 2-mode sociomatrix
mat <- t(table(dat))
# create adjacency matrix as product of the 2-mode sociomatrix
adj.mat <- mat %*% t(mat)
Edit: I originally copied and pasted the wrong pivot_longer() call; this one works.
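The same matrix product can also be sketched in base R directly on the wide data, without reshaping first. This assumes every behavior column is coded strictly 0/1 with no missing values; the small data frame `toy` and its columns `a`, `b`, `c` are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data in the same wide 0/1 layout as the question
toy <- data.frame(
  case_id = c(1, 2, 3, 4),
  a = c(1, 0, 1, 1),
  b = c(1, 0, 0, 1),
  c = c(0, 1, 0, 1)
)

# Drop the id column and convert to a numeric matrix (respondents x behaviors)
m <- as.matrix(toy[, setdiff(names(toy), "case_id")])

# crossprod(m) computes t(m) %*% m: cell [i, j] counts the respondents with
# a 1 in both column i and column j; the diagonal holds the column totals
adj <- crossprod(m)
adj
```

Applied to the original wide dat (before it is overwritten by the long version), the equivalent call would be crossprod(as.matrix(dat[, names(dat) != "case_id"])).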
CodePudding user response:
Another approach is to apply a pairwise operation to all column combinations and then turn the output into a matrix.
There are quite a few packages that make pairwise operations possible: corrr::colpair_map, pwiser::pairwise, dplyover::across2x.
Below is one approach using dplyover::across2x (disclaimer: I'm the maintainer).
library(dplyr)
library(dplyover) # https://github.com/TimTeaFan/dplyover
colnms <- dat %>% select(-case_id) %>% colnames
out <- dat %>%
summarise(across2x(-case_id,
-case_id,
~ sum(.x == 1 & .y == 1)))
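# unlist() turns the one-row tibble into a plain vector,
# so the result is a numeric matrix rather than a list-matrix
matrix(unlist(out),
       ncol = 17,
       dimnames = list(colnms, colnms))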
#> pol_demo pol_demo_online pol_post_online pol_petition
#> pol_demo 3 3 3 3
#> pol_demo_online 3 3 3 3
#> pol_post_online 3 3 3 3
#> pol_petition 3 3 3 4
#> pol_cntct_polit 3 3 3 3
#> pol_party 3 3 3 3
#> pol_other 3 3 3 3
#> pol_demo_illegal 2 2 2 2
#> help_shopping 3 3 3 4
#> help_childcare 2 2 2 2
#> help_general 3 3 3 4
#> help_emo 3 3 3 4
#> help_symb 3 3 3 3
#> help_fin 3 3 3 4
#> help_donation 3 3 3 4
#> help_volunteer 3 3 3 3
#> help_other 3 3 3 3
#> pol_cntct_polit pol_party pol_other pol_demo_illegal
#> pol_demo 3 3 3 2
#> pol_demo_online 3 3 3 2
#> pol_post_online 3 3 3 2
#> pol_petition 3 3 3 2
#> pol_cntct_polit 3 3 3 2
#> pol_party 3 3 3 2
#> pol_other 3 3 3 2
#> pol_demo_illegal 2 2 2 2
#> help_shopping 3 3 3 2
#> help_childcare 2 2 2 2
#> help_general 3 3 3 2
#> help_emo 3 3 3 2
#> help_symb 3 3 3 2
#> help_fin 3 3 3 2
#> help_donation 3 3 3 2
#> help_volunteer 3 3 3 2
#> help_other 3 3 3 2
#> help_shopping help_childcare help_general help_emo help_symb
#> pol_demo 3 2 3 3 3
#> pol_demo_online 3 2 3 3 3
#> pol_post_online 3 2 3 3 3
#> pol_petition 4 2 4 4 3
#> pol_cntct_polit 3 2 3 3 3
#> pol_party 3 2 3 3 3
#> pol_other 3 2 3 3 3
#> pol_demo_illegal 2 2 2 2 2
#> help_shopping 5 3 5 5 4
#> help_childcare 3 3 3 3 3
#> help_general 5 3 5 5 4
#> help_emo 5 3 5 5 4
#> help_symb 4 3 4 4 4
#> help_fin 4 2 4 4 3
#> help_donation 4 2 4 4 3
#> help_volunteer 3 2 3 3 3
#> help_other 4 3 4 4 4
#> help_fin help_donation help_volunteer help_other
#> pol_demo 3 3 3 3
#> pol_demo_online 3 3 3 3
#> pol_post_online 3 3 3 3
#> pol_petition 4 4 3 3
#> pol_cntct_polit 3 3 3 3
#> pol_party 3 3 3 3
#> pol_other 3 3 3 3
#> pol_demo_illegal 2 2 2 2
#> help_shopping 4 4 3 4
#> help_childcare 2 2 2 3
#> help_general 4 4 3 4
#> help_emo 4 4 3 4
#> help_symb 3 3 3 4
#> help_fin 4 4 3 3
#> help_donation 4 4 3 3
#> help_volunteer 3 3 3 3
#> help_other 3 3 3 4
Created on 2021-09-23 by the reprex package (v2.0.1)
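Since the question mentions outer, here is a minimal base R sketch of the same pairwise idea without extra packages. The data frame `toy` and the helper `pair_count` are hypothetical names introduced for illustration, and the sketch assumes 0/1 coding with no missing values. outer() expects a vectorized function, which is why the helper is wrapped in Vectorize().

```r
# Hypothetical toy data: three 0/1 behavior columns, id column already dropped
toy <- data.frame(
  a = c(1, 0, 1, 1),
  b = c(1, 0, 0, 1),
  c = c(0, 1, 0, 1)
)

cols <- names(toy)

# Count the rows in which both named columns equal 1 (illustrative helper)
pair_count <- function(x, y) sum(toy[[x]] == 1 & toy[[y]] == 1)

# outer() passes whole vectors of column names, so the helper is
# vectorized with Vectorize() to be called once per pair of names
adj <- outer(cols, cols, Vectorize(pair_count))
dimnames(adj) <- list(cols, cols)
adj
```

This evaluates every pair of columns separately, so for many columns it will likely be slower than a single matrix product, but the counting rule inside the helper is easy to change.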