I have this data frame:
df <- data.frame(id = c(918, 919, 920, 921, 922),
city = c("a", "c", "b", "c", "a"),
mosquitoes = c(9, 13, 8, 25, 10))
What I want to do is to get the number of unique ID values for each city and then create a new dataframe that should looks like:
newdf <- data.frame(city = c("a", "b", "c"),
id = c(2,1,2),
mosquitoes = c(19, 8, 38))
I know how to do half of that using
newdf <- aggregate(mosquitoes ~ city, data = df, sum)
But no matter how I try, I can't get the range for unique values of ID according to the cities that I have. I've been trying
newdf$id <- aggregate(length(id) ~ city, data = df, sum)
And I also tried a loop (because my original data has way more than 3 cities), but only got disaster and can't make it work at all:
x <- unique(df$city)
unique_ID <-
for (x in df$city) {
city = unique(df$city)
mosquitoes = ?
ID = ?
}
This topic was the most similar to mine I could found, but apparently it only works with numeric values. At least I couldn't make it work with my character columns.
Can someone please help me?
CodePudding user response:
You could do:
library(tidyverse)
df <- data.frame(id = c(918, 919, 920, 921, 922),
city = c("a", "c", "b", "c", "a"),
mosquitoes = c(9, 13, 8, 25, 10))
df %>%
group_by(city) %>%
summarise(id = n(), mosquitoes = sum(mosquitoes))
#> # A tibble: 3 x 3
#> city id mosquitoes
#> <chr> <int> <dbl>
#> 1 a 2 19
#> 2 b 1 8
#> 3 c 2 38
Created on 2022-09-05 with reprex v2.0.2