I have the following data frame:
df1 <- structure(list(name = c("ene", "due", "rabe", "rabe", "kum",
"kum", "kum", "rike", "smake"), type = c("a", "b", "d", "a",
"c", "c", "b", "d", "a")), class = "data.frame", row.names = c(NA,
-9L))
I would like to transform it into the following data frame:
df2 <- structure(list(name = c("ene", "due", "rabe", "kum", "rike",
"smake"), type_a = c(1, 0, 1, 0, 0, 1), type_b = c(0, 1, 0, 1,
0, 0), type_c = c(0, 0, 0, 2, 0, 0), type_d = c(0, 0, 1, 0, 1,
0)), class = "data.frame", row.names = c(NA, -6L))
Basically, I want to split the "type" column into one column per category it contains. Instead of character values, each new column should hold the number of occurrences of that category per name.
How can I do this in R?
EDIT: I tried spread() from tidyr, but it throws an error because the name/type combinations are not unique.
CodePudding user response:
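You can build the contingency table with pivot_wider alone: values_fn = length counts the rows in each name/type cell, values_fill = 0 fills the combinations that never occur, and names_prefix together with names_sort produces the type_a to type_d columns in order.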
library(tidyr)
pivot_wider(df1, names_from = type, names_sort = TRUE, names_prefix = 'type_',
            values_from = type, values_fn = length, values_fill = 0)
# # A tibble: 6 × 5
#   name  type_a type_b type_c type_d
#   <chr>  <int>  <int>  <int>  <int>
# 1 ene        1      0      0      0
# 2 due        0      1      0      0
# 3 rabe       1      0      0      1
# 4 kum        0      1      2      0
# 5 rike       0      0      0      1
# 6 smake      1      0      0      0
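For comparison, here is a minimal base R sketch of the same contingency table. It is not part of the answer above: it swaps pivot_wider for table(). The factor() call with levels = unique(df1$name) is an assumption added here to keep the names in order of first appearance, and the variable names tab and wide are only illustrative.

```r
# Input data from the question
df1 <- data.frame(
  name = c("ene", "due", "rabe", "rabe", "kum", "kum", "kum", "rike", "smake"),
  type = c("a", "b", "d", "a", "c", "c", "b", "d", "a")
)

# Cross-tabulate name by type; the factor keeps names in order of first appearance
tab <- table(factor(df1$name, levels = unique(df1$name)), df1$type)

# Convert the 2-D table to a data frame with one column per type
wide <- as.data.frame.matrix(tab)
names(wide) <- paste0("type_", names(wide))

# Move the row names into a regular "name" column and reset the row names
wide <- cbind(name = rownames(wide), wide)
rownames(wide) <- NULL

wide
```

The counts should match df2 from the question, although they are stored as integers rather than doubles.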
CodePudding user response:
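library(dplyr)
library(tidyr)
df1 %>%
  # one row per name/type pair, with the number of occurrences in n
  count(name, type) %>%
  # one column per type, sorted by name; pairs that never occur become 0
  pivot_wider(names_from = type, values_from = n, values_fill = 0,
              names_sort = TRUE) %>%
  # prefix only the count columns, so that name is not renamed to type_name
  rename_with(~ paste0("type_", .x), .cols = -name)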