A section of my data frame looks like this:
Streets <- c("Muscow tweede", "Muscow NDSM", "kazan Bo", "Kazan Ca")
Hotels <- c(5, 9, 4, 3)
Is there a way to merge "Muscow tweede" and "Muscow NDSM", as well as the two Kazan streets, so that I get the total number of hotels per city rather than per street?
CodePudding user response:
With dplyr:
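library(dplyr)
df <- data.frame(Streets, Hotels)  # assumed: built from the question's vectors
# Group by the lowercased first word of each street name and sum the hotels
df %>%
  group_by(col = tolower(sub(' .*', '', Streets))) %>%
  summarize(Hotels = sum(Hotels))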
Output:
  col    Hotels
  <chr>   <dbl>
1 kazan       7
2 muscow     14
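To see why the grouping works, it can help to look at the key on its own. The following is a minimal sketch using the vector from the question; the name `city` is only illustrative.

```r
Streets <- c("Muscow tweede", "Muscow NDSM", "kazan Bo", "Kazan Ca")

# Drop everything from the first space onward, then lowercase so that
# "kazan" and "Kazan" end up in the same group
city <- tolower(sub(" .*", "", Streets))
city
# [1] "muscow" "muscow" "kazan"  "kazan"
```

Note that this treats the first word as the city, so it would presumably need adjusting for city names that contain a space.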
CodePudding user response:
Another way:
library(dplyr)
library(stringr)
tibble(Streets, Hotels) %>%
  # '\\w+' matches the first run of word characters, i.e. the city name
  mutate(Streets = str_to_title(str_extract(Streets, '\\w+'))) %>%
  group_by(Streets) %>%
  summarise(Hotels = sum(Hotels))
# A tibble: 2 x 2
  Streets Hotels
  <chr>    <dbl>
1 Kazan        7
2 Muscow      14
CodePudding user response:
Another way, with tapply:
with(df, tapply(Hotels, tools::toTitleCase(sub('\\s.*', '', Streets)), sum))
#  Kazan Muscow
#      7     14
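tapply returns a named array rather than a data frame. If a data frame is preferred, one possible conversion is sketched below. It assumes the question's vectors are combined into a data frame called `df`; the names `totals` and `result` are illustrative.

```r
# Assumed: the question's vectors combined into a data frame named `df`
df <- data.frame(
  Streets = c("Muscow tweede", "Muscow NDSM", "kazan Bo", "Kazan Ca"),
  Hotels  = c(5, 9, 4, 3)
)

# Same tapply() call as above, stored instead of printed
totals <- with(df, tapply(Hotels, tools::toTitleCase(sub("\\s.*", "", Streets)), sum))

# names() gives the city labels, as.vector() keeps only the sums
result <- data.frame(City = names(totals), Hotels = as.vector(totals))
result
```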
CodePudding user response:
# Take the first word of each street name as the city, in title case
df1$City <- stringr::str_to_title(stringr::word(df1$Streets, end = 1))
aggregate(Hotels ~ City, data = df1, sum)
    City Hotels
1  Kazan      7
2 Muscow     14
Sample data:
df1 <- data.frame(
  Streets = c("Muscow tweede", "Muscow NDSM", "kazan Bo", "Kazan Ca"),
  Hotels = c(5, 9, 4, 3))