I am searching a way for obtaining a large table from a table with a list of species.
Here I give you an example:
I have something like this
data.frame(survey_id = c("ID_1", "ID_2", "ID_3", "ID_4", "ID_5"),
list_1 = c("A", "A", "A", "B", "A"),
list_2 = c("B", "D", "E", "E", "F"),
list_3 = c("C", "", "", "F", ""))
and I want to obtain this
data.frame(survey_id = c("ID_1", "ID_2", "ID_3", "ID_4", "ID_5"),
A = c(1,1,1,0,1),
B = c(1,0,0,1,0),
C = c(1,0,0,0,0),
D = c(0,1,0,0,0),
E = c(0,0,1,1,0),
F = c(0,0,0,1,1))
Any suggestion using pivot_wider function?
Thanks for your help.
CodePudding user response:
A slightly round-a-bout way: pivot_longer
first, so as you can count by (survey_id, value), and then pivot_wider
.
library(dplyr)
library(tidyr)
df1 %>%
pivot_longer(-Survey_ID) %>%
count(Survey_ID, value) %>%
filter(value != "") %>%
pivot_wider(names_from = "value",
values_from = "n") %>%
replace(is.na(.), 0)
Result:
# A tibble: 5 × 7
Survey_ID A B C D E F
<chr> <int> <int> <int> <int> <int> <int>
1 ID_1 1 1 1 0 0 0
2 ID_2 1 0 0 1 0 0
3 ID_3 1 0 0 0 1 0
4 ID_4 0 1 0 0 1 1
5 ID_5 1 0 0 0 0 1
Data df1
:
df1 <- data.frame(survey_id = c("ID_1", "ID_2", "ID_3", "ID_4", "ID_5"),
list_1 = c("A", "A", "A", "B", "A"),
list_2 = c("B", "D", "E", "E", "F"),
list_3 = c("C", "", "", "F", ""))
CodePudding user response:
df %>%
pivot_longer(-survay_id) %>% # first go longer
mutate(n=1) %>% # add a value to each record
select(-name) %>% # drop the name column
filter(value != "") %>% # remove empty cells caused by pivot_longer
# spread(survay_id, n, fill = 0) # a spread solution
pivot_wider(names_from=survay_id, values_from=n, values_fill=0) # a pivot_wider solution
value ID_1 ID_2 ID_3 ID_4 ID_5
1 A 1 1 1 0 1
2 B 1 0 0 1 0
3 C 1 0 0 0 0
4 D 0 1 0 0 0
5 E 0 0 1 1 0
6 F 0 0 0 1 1