I import a csv into a dataframe with this structure:
id brand p_1 p_2 p_3 p_4 p_5
1 A 1 2 5
2 B 2 3
3 C 3
4 B 1
5 A 2
And I would like to first get it into this structure
p A B C
1 1 1 0
2 2 1 0
3 0 1 1
4 0 0 0
5 1 0 0
So it counts all combinations of values BUT is also counts non existing ones such as 4 that does no appear YET is a value among 1 (min) and 5 (max), and this is the tricky part!
Thanks!
CodePudding user response:
df %>%
pivot_longer(-(1:2)) %>%
filter(!is.na(value)) %>%
count(value, brand) %>%
complete(value = 1:5, brand, fill = list(n=0)) %>%
pivot_wider(names_from = brand, values_from = n, values_fill = 0)
result
# A tibble: 5 × 4
value A B C
<int> <int> <int> <int>
1 1 1 1 0
2 2 2 1 0
3 3 0 1 1
4 4 0 0 0
5 5 1 0 0
source data
df <- data.frame(
stringsAsFactors = FALSE,
id = c(1L, 2L, 3L, 4L, 5L),
brand = c("A", "B", "C", "B", "A"),
p_1 = c(1L, 2L, 3L, 1L, 2L),
p_2 = c(2L, 3L, NA, NA, NA),
p_3 = c(5L, NA, NA, NA, NA),
p_4 = c(NA, NA, NA, NA, NA),
p_5 = c(NA, NA, NA, NA, NA)
)