I use google_places from the googleway package to get a data frame of places from Google. I am searching for "blood donation in Germany" (in German): https://www.google.de/maps/search/Blutspende+in+Deutschland/@51.5019637,6.4650438,12z
The vignette says that each API query returns 20 locations: https://cran.r-project.org/web/packages/googleway/vignettes/googleway-vignette.html
I assume that there are about 300 blood donation places in Germany, so I am trying to build a loop that collects all Google Places results for my search term into one data frame. A similar post can be found here: "next_page_token not working on second attempt (google_places function)".
How can I build my loop so that it returns a data frame of all Google search results?
library(googleway)
# initialize list
datalist = list()
# start first search
key = "YOUR-KEY"
res <- google_places(search_string = "Blutspende in Deutschland",
                     key = key)
# store first 20 results
datalist[[1]] <- data.frame(Name = res$results$name,
                            Place = res$results$formatted_address)
# set next page token
token = res$next_page_token
for(i in 1:10){
  # sleep time
  Sys.sleep(2)
  # next search
  res_n <- google_places(search_string = "Blutspende in Deutschland",
                         page_token = token,
                         key = key)
  # store next results
  datalist[[i + 1]] <- data.frame(Name = res_n$results$name,
                                  Place = res_n$results$formatted_address)
  # set next token again
  token <- res_n$next_page_token
  # print status
  aa = res_n$status
  cat(i, aa, '\n')
}
# to dataframe
big_data = do.call(rbind, datalist)
There are a massive number of duplicates in the results.
library(tidyverse)
big_data %>% distinct() %>% nrow()
For me, I have 54 distinct entries out of 202. I don't know why.
Answer:
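The Google Maps Places API limits responses to 60 locations per query, returned as up to 3 pages of 20 places each (see the Places API docs). After the third page there is no next_page_token, so a loop that keeps requesting with an empty token most likely starts again from the first page, which would explain the duplicates.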
To get more than ~60 observations, one easy trick with googleway is to query by region (the German Länder, i.e. federal states), or even by municipality. In the next example I loop through the 16 German states to get 600 results.
library(tidyverse)
library(googleway)
key <- "your_api_key"
land <- c("Baden-Württemberg", "Bayern", "Berlin", "Brandenburg", "Bremen", "Hamburg", "Hessen", "Mecklenburg-Vorpommern", "Niedersachsen", "Nordrhein-Westfalen", "Rheinland-Pfalz", "Saarland", "Sachsen", "Sachsen-Anhalt", "Schleswig-Holstein", "Thüringen")
queries <- paste0("Blutspende Blutbank in ", land, ", Deutschland")
# A custom loop function for google_places()
google_places_loop <- function(search_string, key, ntimes = 3, page_token = "") {
  print(search_string)
  iter <- 0
  obj_df <- tibble()
  # Stop after ntimes pages or as soon as the API returns no further token
  while (iter < ntimes && !is.null(page_token)) {
    iter <- iter + 1
    print(iter)
    obj_response <- google_places(
      search_string = search_string,
      key = key,
      page_token = page_token,
      language = "DE" # Optional, but note that setting language to German might get you a few more locations
    )
    # Tag each row with the page it came from and append it to the results
    obj_df_new <- as_tibble(obj_response$results) %>% mutate(iter = iter)
    obj_df <- bind_rows(obj_df, obj_df_new)
    page_token <- obj_response$next_page_token
    if (is.null(page_token)) {
      print("No more pagination tokens")
      Sys.sleep(2)
    } else {
      # Give the next page token a few seconds before using it
      Sys.sleep(3)
    }
  }
  obj_df
}
# Finally, we loop through the queries by the custom function.
df_blutspende <- map_df(.x = queries, .f = google_places_loop, key = key)
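Because a place can match more than one of the regional queries, the combined data frame may still contain repeated rows. A minimal sketch of one way to drop them, assuming the results carry the place_id column that the Places API returns for each place. The toy data frame df_toy and its values are made up for illustration and stand in for df_blutspende:

```r
# Hypothetical stand-in for the combined results; real responses have many more columns.
df_toy <- data.frame(
  place_id = c("id_a", "id_b", "id_a", "id_c", "id_b"),
  name = c("Blutspende A", "Blutspende B", "Blutspende A",
           "Blutspende C", "Blutspende B"),
  stringsAsFactors = FALSE
)

# Keep only the first row seen for each place_id.
df_unique <- df_toy[!duplicated(df_toy$place_id), ]

nrow(df_unique)
```

With the tidyverse already loaded, distinct(df_blutspende, place_id, .keep_all = TRUE) should do the same thing. Deduplicating on place_id rather than on name and address avoids treating two rows as different just because the address is formatted differently.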