I have a dataframe:
participants <- c("A", "A", "A", "B", "C", "C")
answers <- c("alpha", "beta", "gamma", "beta", "beta", "gamma")
df1 <- data.frame(participants, answers)
participants answers
A alpha
A beta
A gamma
B beta
C beta
C gamma
The 'answers' column contains many more values than this small set.
How do I turn it into binary features like the following:
participant answers value
A alpha 1
A beta 1
A gamma 1
B alpha 0
B beta 1
B gamma 0
C alpha 0
C beta 1
C gamma 1
My guess is that I have to get the levels of both 'answers' and 'participants'?
But I'm not sure what to do after that. Thanks!
CodePudding user response:
In base R you could do:
data.frame(table(df1))
  participants answers Freq
1            A   alpha    1
2            B   alpha    0
3            C   alpha    0
4            A    beta    1
5            B    beta    1
6            C    beta    1
7            A   gamma    1
8            B   gamma    0
9            C   gamma    1
The above is not ordered the same way as your table. To do that, you could do:
a <- data.frame(table(df1))
a[order(a$participants), ]
  participants answers Freq
1            A   alpha    1
4            A    beta    1
7            A   gamma    1
2            B   alpha    0
5            B    beta    1
8            B   gamma    0
3            C   alpha    0
6            C    beta    1
9            C   gamma    1
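One caveat worth sketching: table() counts occurrences, so Freq is only a 0/1 flag as long as no participant gives the same answer twice. The snippet below is a minimal sketch of clamping the counts to a binary column. The duplicated row in 'df2' and the column name 'value' are assumptions added for illustration, not part of the original data.

```r
# Sample data, same shape as 'df1' in the question
df1 <- data.frame(
  participants = c("A", "A", "A", "B", "C", "C"),
  answers = c("alpha", "beta", "gamma", "beta", "beta", "gamma")
)

# Hypothetical duplicate: participant A gives "alpha" a second time
df2 <- rbind(df1, data.frame(participants = "A", answers = "alpha"))

a <- data.frame(table(df2))        # Freq is a count, so A/alpha is 2 here
a$value <- as.integer(a$Freq > 0)  # clamp counts to a 0/1 indicator

# Order by participant and keep only the columns asked for
a <- a[order(a$participants), c("participants", "answers", "value")]
a
```

If the data never contains repeated participant/answer pairs, Freq and value are identical and the extra step can be skipped.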
CodePudding user response:
If the original data is 'df1', create a column of 1s and then use complete() from tidyr to fill in the missing combinations with 0:
library(tidyr)
library(dplyr)
df1 %>%
  mutate(value = 1) %>%
  complete(participants, answers, fill = list(value = 0))
-output
# A tibble: 9 × 3
  participants answers value
  <chr>        <chr>   <dbl>
1 A            alpha       1
2 A            beta        1
3 A            gamma       1
4 B            alpha       0
5 B            beta        1
6 B            gamma       0
7 C            alpha       0
8 C            beta        1
9 C            gamma       1
data
df1 <- structure(list(participants = c("A", "A", "A", "B", "C", "C"),
answers = c("alpha", "beta", "gamma", "beta", "beta", "gamma"
)), class = "data.frame", row.names = c(NA, -6L))
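The idea from the question (collect the distinct values of both columns, then mark which pairs occur) can also be sketched in base R without any packages. This is only one possible approach, assuming the columns are plain character vectors. The names 'grid', 'observed' and 'out' are illustrative.

```r
df1 <- data.frame(
  participants = c("A", "A", "A", "B", "C", "C"),
  answers = c("alpha", "beta", "gamma", "beta", "beta", "gamma")
)

# Every participant/answer combination, built from the unique values
grid <- expand.grid(
  participants = unique(df1$participants),
  answers = unique(df1$answers),
  stringsAsFactors = FALSE
)

# Observed pairs get value 1; unique() guards against repeated rows
observed <- unique(df1)
observed$value <- 1L

# Left join: combinations that never occur come back as NA, then become 0
out <- merge(grid, observed, by = c("participants", "answers"), all.x = TRUE)
out$value[is.na(out$value)] <- 0L

# Sort explicitly rather than relying on merge()'s row order
out <- out[order(out$participants, out$answers), ]
rownames(out) <- NULL
out
```

This does by hand roughly what complete() does: expand.grid() enumerates the combinations and the left join supplies the fill value.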