I have this data:
set.seed(51)
df_1 <- data.frame(
  nomes = LETTERS[1:100],
  filtro1 = sample(x = c("sim", "não"), size = 100, replace = TRUE),
  filtro2 = sample(x = c("sim", "não"), size = 100, replace = TRUE),
  genero = sample(x = c("masculino", "feminino"), size = 100, replace = TRUE),
  groups = sample(x = 1:3, size = 100, replace = TRUE)
)
And this code:
library(dplyr)
df_1 %>%
  group_by(groups, genero) %>%
  summarise(count = n()) %>%
  mutate(percent = count/sum(count)) %>%
  filter(count == max(count))
The result is:
# Groups: groups [3]
groups genero count percent
<int> <chr> <int> <dbl>
1 1 feminino 16 0.533
2 2 masculino 19 0.633
3 3 masculino 21 0.525
I would like these categories to be recycled with mutate(). That is, the maximum values should be repeated within their respective groups. See:
df_1 %>%
  group_by(groups, genero) %>%
  mutate(count = n()) %>% # replace summarise by mutate
  mutate(percent = count/sum(count)) %>%
  filter(count == max(count))
This doesn't work.
I would like the values to repeat along the new column created with mutate(). Like this:
0.533
0.633
0.525
0.533
0.525
0.633
0.525
...
Answer:
- summarise() automatically drops the last grouping level. mutate() doesn't do this, so you have to do so manually with a second group_by().
- Because you still have multiple rows per group after mutate(), sum(count) won't give you what you want (the overall n per group). Instead, use another call to n().
library(dplyr)
df_1 %>%
  group_by(groups, genero) %>%
  mutate(count = n()) %>%
  group_by(groups) %>%
  mutate(percent = count/n()) %>%
  filter(count == max(count))
Output:
# A tibble: 58 × 7
# Groups: groups [3]
nomes filtro1 filtro2 genero groups count percent
1 A sim sim feminino 1 16 0.593
2 C sim sim feminino 1 16 0.593
3 E não não feminino 1 16 0.593
4 H sim não feminino 2 20 0.541
5 J não não feminino 3 22 0.611
6 K não não feminino 2 20 0.541
7 M não não feminino 1 16 0.593
8 N não sim feminino 1 16 0.593
9 P não não feminino 2 20 0.541
10 R não não feminino 1 16 0.593
# … with 48 more rows
# ℹ Use `print(n = ...)` to see more rows
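To see why sum(count) goes wrong once the rows are kept, a small sketch on made-up data may help. The data frame toy and its columns g and s are hypothetical and not part of the question; the column names wrong and right are only labels for the comparison.

```r
library(dplyr)

# Hypothetical toy data (an assumption for illustration, not from the question):
# group 1 has three rows ("a", "a", "b"), group 2 has two rows ("x", "y").
toy <- data.frame(
  g = c(1, 1, 1, 2, 2),
  s = c("a", "a", "b", "x", "y")
)

res <- toy %>%
  group_by(g, s) %>%
  mutate(count = n()) %>%        # rows per (g, s) pair, repeated on every row
  group_by(g) %>%                # regroup by g only; mutate() does not drop a level
  mutate(
    wrong = count / sum(count),  # sum(count) adds count once per row, not once per category
    right = count / n()          # n() is the number of rows in the current g group
  ) %>%
  ungroup()

# In g = 1 the counts are 2, 2, 1, so sum(count) is 5 although the group has 3 rows.
print(res)
```

With summarise() each category appears once per group, so sum(count) equals the group size; with mutate() each count is repeated on every row, which is why n() is the safer denominator here.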