Summation over multiple rows based on a condition in R-CodePudding

I have a dataset like the following

name	city	number
A	M	2
A	N	3
A	O	5
A	P	7
B	M	7
B	N	8
B	0	9
B	P	2

For each of of the name category, I want to sum the number of M and N value and put it in a new variable. The same goes for O and P value. The dataset should look like the following:

name	city	number
A	X	5
A	Y	12
B	X	15
B	Y	11

I'm new in R programming. I have tried to use group by and mutate method but was not successful.

CodePudding user response：

We could modify the values in the column 'city' to 'X', 'Y', and do a group by sum

library(dplyr)
df1 %>%
   group_by(name, city = case_when(city %in% c("M", "N") ~ 'X',
    city %in% c("O", "P") ~ "Y")) %>%
   summarise(number = sum(number), .groups = 'drop')

-output

# A tibble: 4 × 3
  name  city  number
  <chr> <chr>  <int>
1 A     X          5
2 A     Y         12
3 B     X         15
4 B     Y         11

data

df1 <- structure(list(name = c("A", "A", "A", "A", "B", "B", "B", "B"
), city = c("M", "N", "O", "P", "M", "N", "O", "P"), number = c(2L, 
3L, 5L, 7L, 7L, 8L, 9L, 2L)), row.names = c(NA, -8L), class = "data.frame")