I have a table such as
Group Family Nb
1 A 15
2 B 20
3 A 2
3 B 1
3 C 1
4 D 10
4 A 5
5 B 1
5 D 1
And I would like to transform that dataframe such that I have each unique Family
element in columns, and for each Group
, the frequency of the Nb
element, I should then get :
Group A B C D E F
1 1 0 0 0 0 0
2 0 1 0 0 0 0
3 0.5 0.25 0.25 0 0 0
4 0.33 0 0 0.67 0 0
5 0 0.5 0 0.5 0 0
Here is the dput format of the tabel if it can helps :
Family = c("A", "B", "A", "B", "C", "D", "A", "B", "D"),
Nb = c(15L, 20L, 2L, 1L, 1L, 10L, 5L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-9L))
CodePudding user response:
in Base R:
prop.table(xtabs(Nb ~ ., df), 1)
# Family
#Group A B C D
# 1 1.0000000 0.0000000 0.0000000 0.0000000
# 2 0.0000000 1.0000000 0.0000000 0.0000000
# 3 0.5000000 0.2500000 0.2500000 0.0000000
# 4 0.3333333 0.0000000 0.0000000 0.6666667
# 5 0.0000000 0.5000000 0.0000000 0.5000000
If you need it as a data.frame, just wrap the results in as.data.frame.matrix
CodePudding user response:
You can first group_by
the Group
column, then calculate the frequency and finally pivot
the data into a "wide" format.
library(tidyverse)
df %>%
group_by(Group) %>%
mutate(Nb = Nb/sum(Nb)) %>%
pivot_wider(Group, names_from = "Family", values_from = "Nb", values_fill = 0)
# A tibble: 5 × 5
# Groups: Group [5]
Group A B C D
<int> <dbl> <dbl> <dbl> <dbl>
1 1 1 0 0 0
2 2 0 1 0 0
3 3 0.5 0.25 0.25 0
4 4 0.333 0 0 0.667
5 5 0 0.5 0 0.5
CodePudding user response:
Another possible solution:
library(tidyverse)
df %>%
pivot_wider(names_from = Family, values_from = Nb, values_fill = 0) %>%
mutate(aux = rowSums(.[-1]), across(-Group, ~ .x / aux), aux = NULL)
#> # A tibble: 5 × 5
#> Group A B C D
#> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 0 0 0
#> 2 2 0 1 0 0
#> 3 3 0.5 0.25 0.25 0
#> 4 4 0.333 0 0 0.667
#> 5 5 0 0.5 0 0.5