I have a dataset like the following
name | city | number |
---|---|---|
A | M | 2 |
A | N | 3 |
A | O | 5 |
A | P | 7 |
B | M | 7 |
B | N | 8 |
B | 0 | 9 |
B | P | 2 |
For each of of the name category, I want to sum the number of M and N value and put it in a new variable. The same goes for O and P value. The dataset should look like the following:
name | city | number |
---|---|---|
A | X | 5 |
A | Y | 12 |
B | X | 15 |
B | Y | 11 |
I'm new in R programming. I have tried to use group by and mutate method but was not successful.
CodePudding user response:
We could modify the values in the column 'city' to 'X', 'Y', and do a group by sum
library(dplyr)
df1 %>%
group_by(name, city = case_when(city %in% c("M", "N") ~ 'X',
city %in% c("O", "P") ~ "Y")) %>%
summarise(number = sum(number), .groups = 'drop')
-output
# A tibble: 4 × 3
name city number
<chr> <chr> <int>
1 A X 5
2 A Y 12
3 B X 15
4 B Y 11
data
df1 <- structure(list(name = c("A", "A", "A", "A", "B", "B", "B", "B"
), city = c("M", "N", "O", "P", "M", "N", "O", "P"), number = c(2L,
3L, 5L, 7L, 7L, 8L, 9L, 2L)), row.names = c(NA, -8L), class = "data.frame")