Is there a way to summarize the distinct values of one variable by another variable?
It's similar to pivoting from long to wide, except that the values are collected into a single vector per group rather than spread across multiple variables.
data have:
| var1 | var2 |
| :--: |:------:|
| 1 | 2 |
| 1 | 4 |
| 1 | 4 |
| 1 | 4 |
| 1 | 6 |
| 2 | 8 |
| 2 | 8 |
| 2 | 10 |
| 2 | 12 |
data want:
| var1 | var2 |
| :--: |:---------:|
| 1 | (2, 4, 6) |
| 2 | (8, 10, 12) |
CodePudding user response:
We could create a list column after getting the unique elements:
library(dplyr)
df1 %>%
  distinct() %>%
  group_by(var1) %>%
  summarise(var2 = list(var2))
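For anyone who wants to run this end to end, here is a minimal sketch. The `df1` data frame is rebuilt from the table in the question. The second pipeline is an assumption about what might be wanted, not part of the answer above: it collapses each group into a single string such as `(2, 4, 6)` instead of a list column. The names `out` and `out_chr` are only illustrative.

```r
library(dplyr)

# Example data reconstructed from the "data have" table in the question
df1 <- data.frame(
  var1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  var2 = c(2, 4, 4, 4, 6, 8, 8, 10, 12)
)

# List column: one numeric vector of unique values per var1 group
out <- df1 %>%
  distinct() %>%
  group_by(var1) %>%
  summarise(var2 = list(var2))

# Assumed variant: a character column formatted like the "data want" table
out_chr <- df1 %>%
  distinct() %>%
  group_by(var1) %>%
  summarise(var2 = paste0("(", toString(var2), ")"))
```

The list column keeps the values numeric, so they can still be used in later computations; the string version is probably only useful for display.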
CodePudding user response:
A base R approach with aggregate, keeping the unique var2 values for each var1:
aggregate(. ~ var1, df, function(x) list(unique(x)))
  var1      var2
1    1   2, 4, 6
2    2 8, 10, 12
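As a possible alternative in base R, the same grouping can be sketched with `split()` and `lapply()`, which makes the per-group step explicit. This is a different technique from the `aggregate()` call above, not a restatement of it. The `df` data frame is rebuilt from the question's table, and the names `vals` and `out` are only illustrative.

```r
# Example data reconstructed from the "data have" table in the question
df <- data.frame(
  var1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  var2 = c(2, 4, 4, 4, 6, 8, 8, 10, 12)
)

# Split var2 by var1, then keep the unique values within each group;
# the result is a named list with one numeric vector per var1 value
vals <- lapply(split(df$var2, df$var1), unique)

# Optional: format each group as a string like "(2, 4, 6)" for display
out <- data.frame(
  var1 = as.numeric(names(vals)),
  var2 = vapply(vals, function(x) paste0("(", toString(x), ")"), character(1)),
  row.names = NULL
)
```

Here `vals[["1"]]` holds the unique values for `var1 == 1`, so the named list can be used directly if the formatted data frame is not needed.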